  • 1.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    359,569 commits with source code density; 1149 commits of which have software maintenance activity labels (adaptive, corrective, perfective). 2019. Dataset
  • 2.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Efficient Automatic Change Detection in Software Maintenance and Evolutionary Processes. 2020. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Software maintenance is such an integral part of its evolutionary process that it consumes much of the total resources available. Some estimate the costs of maintenance to be up to 100 times the amount of developing a software. A software not maintained builds up technical debt, and not paying off that debt timely will eventually outweigh the value of the software, if no countermeasures are undertaken. A software must adapt to changes in its environment, or to new and changed requirements. It must further receive corrections for emerging faults and vulnerabilities. Constant maintenance can prepare a software for the accommodation of future changes.

    While there may be plenty of rationale for future changes, the reasons behind historical changes may no longer be accessible. Understanding change in software evolution provides valuable insights into, e.g., the quality of a project, or aspects of the underlying development process. These are worth exploiting, for, e.g., fault prediction, managing the composition of the development team, or for effort estimation models. The size of software is a metric often used in such models, yet it is not well-defined. In this thesis, we seek to establish a robust, versatile and computationally cheap metric that quantifies the size of changes made during maintenance. We operationalize this new metric and exploit it for automated and efficient commit classification.

    Our results show that the density of a commit, that is, the ratio between its net- and gross-size, is a metric that can replace other, more expensive metrics in existing classification models. Models using this metric represent the current state of the art in automatic commit classification. The density provides a more fine-grained and detailed insight into the types of maintenance activities in a software project.
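
    The ratio described above can be sketched in a few lines of Python. This is only an illustration: the abstract does not say how net size is computed, so the rule used here (drop blank lines and lines starting with a comment prefix) and the names commit_density and comment_prefix are assumptions, not the thesis's method.

```python
def commit_density(changed_lines, comment_prefix="#"):
    """Return the net/gross size ratio for the changed lines of one commit.

    Assumption: a line counts toward net size if it is neither blank
    nor a pure comment line. The real metric may exclude more.
    """
    gross = len(changed_lines)
    if gross == 0:
        return 0.0  # an empty change has no meaningful density
    net = sum(
        1
        for line in changed_lines
        if line.strip() and not line.strip().startswith(comment_prefix)
    )
    return net / gross


# Hypothetical commit: one comment, one blank line, two lines of code.
example = ["# fix off-by-one", "", "for i in range(n):", "    total += i"]
density = commit_density(example)
print(density)
```

    A density close to 1 suggests that most of the change is functional code, while a low density points to a change dominated by comments or whitespace.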

    Additional properties of commits, such as their relation or intermediate sojourn-times, have not been previously exploited for improved classification of changes. We reason about the potential of these, and suggest and implement dependent mixture- and Bayesian models that exploit joint conditional densities, models that each have their own trade-offs with regard to computational cost and complexity, and prediction accuracy. Such models can outperform well-established classifiers, such as Gradient Boosting Machines.

    All of our empirical evaluations comprise large datasets, software and experiments, all of which we have published alongside the results as open-access. We have reused, extended and created datasets, and released software packages for change detection and Bayesian models used for all of the studies conducted.

    Full text (pdf)
    Licentiate Thesis (Comprehensive Summary)
  • 3.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Exploiting Relations, Sojourn-Times, and Joint Conditional Probabilities for Automated Commit Classification. 2023. In: Proceedings of the 18th International Conference on Software Technologies, July 10-12, 2023, Rome, Italy / [ed] Hans-Georg Fill, Francisco José Domínguez-Mayo, Marten van Sinderen, and Leszek A. Maciaszek, SciTePress, 2023, pp. 323-331. Conference paper (Refereed)
    Abstract [en]

    The automatic classification of commits can be exploited for numerous applications, such as fault prediction, or determining maintenance activities. Additional properties, such as parent-child relations or sojourn-times between commits, were not previously considered for this task. However, such data cannot be leveraged well using traditional machine learning models, such as Random forests. Suitable models are, e.g., Conditional Random Fields or recurrent neural networks. We reason about the Markovian nature of the problem and propose models to address it. The first model is a generalized dependent mixture model, facilitating the Forward algorithm for 1st- and 2nd-order processes, using maximum likelihood estimation. We then propose a second, non-parametric model, that uses Bayesian segmentation and kernel density estimation, which can be effortlessly adapted to work with nth-order processes. Using an existing dataset with labeled commits as ground truth, we extend this dataset with relations between and sojourn-times of commits, by re-engineering the labeling rules first and meeting a high agreement between labelers. We show the strengths and weaknesses of either kind of model and demonstrate their ability to outperform the state-of-the-art in automated commit classification.
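
    For readers unfamiliar with the Forward algorithm mentioned above, the following is a minimal sketch of the textbook first-order recursion for a hidden Markov model with discrete observations. It is not the paper's generalized dependent mixture model. The function name and all probabilities are made up for illustration, and the observation sequence is assumed to be non-empty.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Likelihood of an observation sequence under a first-order HMM.

    initial[s]        : probability of starting in state s
    transition[r][s]  : probability of moving from state r to state s
    emission[s][o]    : probability of observing symbol o in state s
    """
    n_states = len(initial)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    # Recursion: propagate the forward variables one observation at a time.
    for obs in observations[1:]:
        alpha = [
            sum(alpha[prev] * transition[prev][s] for prev in range(n_states))
            * emission[s][obs]
            for s in range(n_states)
        ]
    # Termination: sum over all states the sequence may end in.
    return sum(alpha)


# Hypothetical two-state model with two observable symbols (0 and 1).
initial = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.8, 0.2], [0.3, 0.7]]
likelihood = forward_likelihood(initial, transition, emission, [0, 1])
print(likelihood)
```

    A second-order variant would condition each step on the two preceding states instead of one, which enlarges the transition table accordingly.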

    Full text (pdf)
    fulltext
  • 4.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Git Density: Analyze git repositories to extract the Source Code Density and other Commit Properties. 2020. Other (Other academic)
    Abstract [en]

    Git Density (git-density) is a tool to analyze git-repositories with the goal of detecting the source code density. It was developed during the research phase of the short technical paper and poster "A changeset-based approach to assess source code density and developer efficacy" and has since been extended to support thorough analyses and insights.

    Download (zip)
    git-density-release-2020.2
  • 5.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    mmb: Arbitrary Dependency Mixed Multivariate Bayesian Models. 2020. Other (Refereed)
    Abstract [en]

    The challenges posed by dependent variables in classification and regression using techniques based on Bayes' theorem are often avoided by assuming strong independence between the variables. Hence, such techniques are called naive. While analytical solutions supporting classification on arbitrary numbers of discrete and continuous random variables exist, practical solutions are scarce. This is true for Bayesian models that support regression and neighborhood search, likewise. To overcome the naive independence assumption, those models analytically resolve the dependencies using empirical joint conditional probabilities and joint conditional probability densities. These are obtained by posterior probabilities of the dependent variable after segmenting the dataset for each random variable's value.

    We demonstrate the advantages of these models: (i) they are deterministic, i.e., no randomization or weights and, hence, no training is required; (ii) each random variable may have an arbitrary probability distribution; and (iii) online learning is effortlessly possible. We evaluate a few Bayesian models empirically and assess their performance by comparing them against well-established classifiers and regression models, using well-known datasets. In classification, our models can outperform others in certain settings. In regression, our models deliver respectable performance without leading the field. Additionally, we provide a true statistical distance metric and a neighborhood search based on such models.
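
    The segmentation idea can be illustrated for the purely discrete case. The sketch below filters a dataset by each observed feature value in turn and reads the class probabilities off the remaining segment. It is an assumption-laden simplification and not the mmb package's API: the function name, the column names and the rows are hypothetical, and continuous variables (which the abstract handles via probability densities) are left out.

```python
from collections import Counter


def segmented_posterior(rows, evidence, target):
    """Estimate P(target | evidence) by successively segmenting the data.

    rows     : list of dicts, one per observation
    evidence : dict mapping feature name -> observed value
    target   : name of the dependent variable
    """
    segment = rows
    # Narrow the dataset down to rows that match every observed value.
    for feature, value in evidence.items():
        segment = [row for row in segment if row[feature] == value]
    counts = Counter(row[target] for row in segment)
    total = sum(counts.values())
    if total == 0:
        return {}  # no matching observations, so no estimate is possible
    return {label: count / total for label, count in counts.items()}


# Hypothetical labeled commits.
rows = [
    {"size": "small", "lang": "py", "label": "corrective"},
    {"size": "small", "lang": "py", "label": "corrective"},
    {"size": "small", "lang": "py", "label": "adaptive"},
    {"size": "large", "lang": "py", "label": "adaptive"},
    {"size": "small", "lang": "c", "label": "perfective"},
]
posterior = segmented_posterior(rows, {"size": "small", "lang": "py"}, "label")
print(posterior)
```

    Because the estimate is read directly from the data, nothing is trained, and appending new rows immediately changes the result, which is in line with the determinism and online-learning properties listed above.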

    Download (zip)
    R-mmb-cran-release-no1
  • 6.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Quantifying Process Quality: The Role of Effective Organizational Learning in Software Evolution. 2023. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Real-world software applications must constantly evolve to remain relevant. This evolution occurs when developing new applications or adapting existing ones to meet new requirements, make corrections, or incorporate future functionality. Traditional methods of software quality control involve software quality models and continuous code inspection tools. These measures focus on directly assessing the quality of the software. However, there is a strong correlation and causation between the quality of the development process and the resulting software product. Therefore, improving the development process indirectly improves the software product, too. To achieve this, effective learning from past processes is necessary, often embraced through post mortem organizational learning. While qualitative evaluation of large artifacts is common, smaller quantitative changes captured by application lifecycle management are often overlooked. In addition to software metrics, these smaller changes can reveal complex phenomena related to project culture and management. Leveraging these changes can help detect and address such complex issues.

    Software evolution was previously measured by the size of changes, but the lack of consensus on a reliable and versatile quantification method prevents its use as a dependable metric. Different size classifications fail to reliably describe the nature of evolution. While application lifecycle management data is rich, identifying which artifacts can model detrimental managerial practices remains uncertain. Approaches such as simulation modeling, discrete events simulation, or Bayesian networks have only limited ability to exploit continuous-time process models of such phenomena. Even worse, the accessibility and mechanistic insight into such gray- or black-box models are typically very low. To address these challenges, we suggest leveraging objectively captured digital artifacts from application lifecycle management, combined with qualitative analysis, for efficient organizational learning. A new language-independent metric is proposed to robustly capture the size of changes, significantly improving the accuracy of change nature determination. The classified changes are then used to explore, visualize, and suggest maintenance activities, enabling solid prediction of malpractice presence and -severity, even with limited data. Finally, parts of the automatic quantitative analysis are made accessible, potentially replacing expert-based qualitative analysis in parts.

    Full text (pdf)
    Comprehensive summary
    Download (jpg)
    presentation image
  • 7.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Technical Reports Compilation: Detecting the Fire Drill Anti-pattern Using Source Code and Issue-Tracking Data. 2023. Report (Other academic)
    Abstract [en]

    Detecting the presence of project management anti-patterns (AP) currently requires experts on the matter and is an expensive endeavor. Worse, experts may introduce their individual subjectivity or bias. Using the Fire Drill AP, we first introduce a novel way to translate descriptions into detectable AP that are comprised of arbitrary metrics and events such as logged time or maintenance activities, which are mined from the underlying source code or issue-tracking data, thus making the description objective as it becomes data-based. Secondly, we demonstrate a novel method to quantify and score the deviations of real-world projects to data-based AP descriptions. Using fifteen real-world projects that exhibit a Fire Drill to some degree, we show how to further enhance the translated AP. The ground truth in these projects was extracted from two individual experts and consensus was found between them. We introduce a novel method called automatic calibration, that optimizes a pattern such that only necessary and important scores remain that suffice to confidently detect the degree to which the AP is present. Without automatic calibration, the proposed patterns show only weak potential for detecting the presence. Enriching the AP with data from real-world projects significantly improves the potential. We also introduce a no-pattern approach that exploits the ground truth for establishing a new, quantitative understanding of the phenomenon, as well as for finding gray-/black-box predictive models. We conclude that the presence detection and severity assessment of the Fire Drill anti-pattern, as well as some of its related and similar patterns, is certainly possible using some of the presented approaches.

  • 8.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    A changeset-based approach to assess source code density and developer efficacy. 2018. In: ICSE '18 Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, IEEE, 2018, pp. 220-221. Conference paper (Refereed)
    Abstract [en]

    The productivity of a (team of) developer(s) can be expressed as a ratio between effort and delivered functionality. Several different estimation models have been proposed. These are based on statistical analysis of real development projects; their accuracy depends on the number and the precision of data points. We propose a data-driven method to automate the generation of precise data points. Functionality is proportional to the code size and Lines of Code (LoC) is a fundamental metric of code size. However, code size and LoC are not well defined as they could include or exclude lines that do not affect the delivered functionality. We present a new approach to measure the density of code in software repositories. We demonstrate how the accuracy of development time spent in relation to delivered code can be improved when basing it on net- instead of gross-size measurements. We validated our tool by studying ca. 1,650 open-source software projects.

  • 9.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Bayesian Regression on segmented data using Kernel Density Estimation. 2019. In: 5th annual Big Data Conference: Linnaeus University, Växjö, Sweden, 5-6 December 2019, Zenodo, 2019. Conference paper (Other academic)
    Abstract [en]

    The challenge of having to deal with dependent variables in classification and regression using techniques based on Bayes' theorem is often avoided by assuming a strong independence between them, hence such techniques are said to be naive. While analytical solutions supporting classification on arbitrary numbers of discrete and continuous random variables exist, practical solutions are scarce. We evaluate a few Bayesian models empirically and consider their computational complexity. To overcome the often assumed independence, those models attempt to resolve the dependencies using empirical joint conditional probabilities and joint conditional probability densities. These are obtained by posterior probabilities of the dependent variable after segmenting the dataset for each random variable's value. We demonstrate the advantages of these models, such as their nature being deterministic (no randomization or weights required), that no training is required, that each random variable may have any kind of probability distribution, how robustness is upheld without having to impute missing data, and that online learning is effortlessly possible. We compare such Bayesian models against well-established classifiers and regression models, using some well-known datasets. We conclude that our evaluated models can outperform other models in certain settings, using classification. The regression models deliver respectable performance, without leading the field.

  • 10.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Contextual Operationalization of Metrics as Scores: Is My Metric Value Good? 2022. In: Proceedings of the 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), IEEE, 2022, pp. 333-343. Conference paper (Refereed)
    Abstract [en]

    Software quality models aggregate metrics to indicate quality. Most metrics reflect counts derived from events or attributes that cannot directly be associated with quality. Worse, what constitutes a desirable value for a metric may vary across contexts. We demonstrate an approach to transforming arbitrary metrics into absolute quality scores by leveraging metrics captured from similar contexts. In contrast to metrics, scores represent freestanding quality properties that are also comparable. We provide a web-based tool for obtaining contextualized scores for metrics as obtained from one’s software. Our results indicate that significant differences among various metrics and contexts exist. The suggested approach works with arbitrary contexts. Given sufficient contextual information, it allows for answering the question of whether a metric value is good/bad or common/extreme.

  • 11.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities. 2019. In: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS) / [ed] Dr. David Shepherd, IEEE, 2019, pp. 109-120. Conference paper (Refereed)
    Abstract [en]

    Commit classification, the automatic classification of the purpose of changes to software, can support the understanding and quality improvement of software and its development process. We introduce code density of a commit, a measure of the net size of a commit, as a novel feature and study how well it is suited to determine the purpose of a change. We also compare the accuracy of code-density-based classifications with existing size-based classifications. By applying standard classification models, we demonstrate the significance of code density for the accuracy of commit classification. We achieve up to 89% accuracy and a Kappa of 0.82 for the cross-project commit classification where the model is trained on one project and applied to other projects. Such highly accurate classification of the purpose of software changes helps to improve the confidence in software (process) quality analyses exploiting this classification information.
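
    Accuracy and Kappa, the two figures reported above, can be computed as in the following sketch. It assumes that Kappa refers to Cohen's Kappa, which corrects the observed agreement for the agreement expected by chance. The labels in the example are invented and unrelated to the study's data.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's Kappa) for two equally long label lists."""
    n = len(y_true)
    # Observed agreement: share of commits whose predicted label is correct.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return observed, 1.0  # degenerate case: only one label on both sides
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical labels: a = adaptive, c = corrective, p = perfective.
y_true = ["a", "a", "c", "c", "p", "p"]
y_pred = ["a", "a", "c", "p", "p", "p"]
accuracy, kappa = accuracy_and_kappa(y_true, y_pred)
print(accuracy, kappa)
```

    Kappa is lower than accuracy whenever some of the agreement could have arisen by chance, which is why both values are usually reported together.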

  • 12.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Metrics As Scores: A Tool- and Analysis Suite and Interactive Application for Exploring Context-Dependent Distributions. 2023. In: Journal of Open Source Software, E-ISSN 2475-9066, Vol. 8, no. 88, article id 4913. Article in journal (Refereed)
    Abstract [en]

    Metrics As Scores can be thought of as an interactive, multiple analysis of variance (abbr. "ANOVA," Chambers et al., 2017). An ANOVA might be used to estimate the goodness-of-fit of a statistical model. Beyond ANOVA, which is used to analyze the differences among hypothesized group means for a single quantity (feature), Metrics As Scores seeks to answer the question of whether a sample of a certain feature is more or less common across groups. This approach to data visualization and -exploration has been used previously (e.g., Jiang et al., 2022). Beyond this, Metrics As Scores can determine what might constitute a good/bad, acceptable/alarming, or common/extreme value, and how distant the sample is from that value, for each group. This is expressed in terms of a percentile (a standardized scale of [0, 1]), which we call score. Considering all available features among the existing groups furthermore allows the user to assess how different the groups are from each other, or whether they are indistinguishable from one another.

    The name Metrics As Scores was derived from its initial application: examining differences of software metrics across application domains (Hönel et al., 2022). A software metric is an aggregation of one or more raw features according to some well-defined standard, method, or calculation. In software processes, such aggregations are often counts of events or certain properties (Florac & Carleton, 1999). However, without the aggregation that is done in a quality model, raw data (samples) and software metrics are rarely of great value to analysts and decision-makers. This is because quality models are conceived to establish a connection between software metrics and certain quality goals (Kaner & Bond, 2004). It is, therefore, difficult to answer the question "is my metric value good?".

    With Metrics As Scores we present an approach that, given some ideal value, can transform any sample into a score, given a sample of sufficiently many relevant values. While such ideal values for software metrics were previously attempted to be derived from, e.g., experience or surveys (Benlarbi et al., 2000), benchmarks (Alves et al., 2010), or by setting practical values (Grady, 1992), with Metrics As Scores we suggest deriving ideal values additionally in non-parametric, statistical ways. To do so, data first needs to be captured in a relevant context (group). A feature value might be good in one context, while it is less so in another. Therefore, we suggest generalizing and contextualizing the approach taken by Ulan et al. (2021), in which a score is defined to always have a range of [0, 1] and linear behavior. This means that scores can now also be compared and that a fixed increment in any score is equally valuable among scores. This is not the case for raw features, otherwise.

    Metrics As Scores consists of a tool- and analysis suite and an interactive application that allows researchers to explore and understand differences in scores across groups. The operationalization of features as scores lies in gathering values that are context-specific (group-typical), determining an ideal value non-parametrically or by user preference, and then transforming the observed values into distances. Metrics As Scores enables this procedure by unifying the way of obtaining probability densities/masses and conducting appropriate statistical tests. More than 120 different parametric distributions (approx. 20 of which are discrete) are fitted through a common interface. Those distributions are part of the scipy package for the Python programming language, which Metrics As Scores makes extensive use of (Virtanen et al., 2020). While fitting continuous distributions is straightforward using maximum likelihood estimation, many discrete distributions have integral parameters. For these, Metrics As Scores solves a mixed-variable global optimization problem using a genetic algorithm in pymoo (Blank & Deb, 2020). In addition, empirical distributions (continuous and discrete) and smooth approximate kernel density estimates are available. Applicable statistical tests for assessing the goodness-of-fit are automatically performed. These tests are used to select some best-fitting random variable in the interactive web application. As an application written in Python, Metrics As Scores is made available as a package that is installable using the Python Package Index (PyPI): pip install metrics-as-scores. As such, the application can be used in a stand-alone manner and does not require additional packages, such as a web server or third-party libraries.
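
    The notion of a context-dependent score in [0, 1] can be illustrated with an empirical distribution. The sketch below does not use the metrics-as-scores package. It assumes the simplest reading of the abstract, namely that a score is the empirical percentile of an observed value within the values of its group, and the function name and both groups are hypothetical.

```python
from bisect import bisect_right


def ecdf_score(context_values, observed):
    """Score in [0, 1]: share of context values that are <= observed."""
    ordered = sorted(context_values)
    return bisect_right(ordered, observed) / len(ordered)


# Two hypothetical contexts (groups) for the same metric.
group_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
group_b = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

score_a = ecdf_score(group_a, 10)  # the value is extreme in group A
score_b = ecdf_score(group_b, 10)  # the same value is low in group B
print(score_a, score_b)
```

    The same raw value of 10 lands at opposite ends of the scale in the two groups, which is the reason for capturing data per context before judging whether a metric value is good.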

    Full text (pdf)
    fulltext
  • 13.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Using source code density to improve the accuracy of automatic commit classification into maintenance activities. 2020. In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 168, pp. 1-19, article id 110673. Article in journal (Refereed)
    Abstract [en]

    Source code is changed for a reason, e.g., to adapt, correct, or perfect it. This reason can provide valuable insight into the development process but is rarely explicitly documented when the change is committed to a source code repository. Automatic commit classification uses features extracted from commits to estimate this reason.

    We introduce source code density, a measure of the net size of a commit, and show how it improves the accuracy of automatic commit classification compared to previous size-based classifications. We also investigate how preceding generations of commits affect the class of a commit, and whether taking the code density of previous commits into account can improve the accuracy further.

    We achieve up to 89% accuracy and a Kappa of 0.82 for the cross-project commit classification where the model is trained on one project and applied to other projects. Models trained on single projects yield accuracies of up to 93% with a Kappa approaching 0.90. The accuracy of the automatic commit classification has a direct impact on software (process) quality analyses that exploit the classification, so our improvements to the accuracy will also improve the confidence in such analyses.

  • 14.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pícha, Petr
    University of West Bohemia, Czechia.
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Brada, Premek
    University of West Bohemia, Czechia.
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Activity-Based Detection of (Anti-)Patterns: An Embedded Case Study of the Fire Drill. 2024. In: e-Informatica Software Engineering Journal, ISSN 1897-7979, E-ISSN 2084-4840, Vol. 18, no. 1, article id 240106. Article in journal (Refereed)
    Abstract [en]

    Background: Nowadays, expensive, error-prone, expert-based evaluations are needed to identify and assess software process anti-patterns. Process artifacts cannot be automatically used to quantitatively analyze and train prediction models without exact ground truth. Aim: Develop a replicable methodology for organizational learning from process (anti-)patterns, demonstrating the mining of reliable ground truth and exploitation of process artifacts. Method: We conduct an embedded case study to find manifestations of the Fire Drill anti-pattern in n = 15 projects. To ensure quality, three human experts agree. Their evaluation and the process’ artifacts are utilized to establish a quantitative understanding and train a prediction model. Results: Qualitative review shows many project issues. (i) Expert assessments consistently provide credible ground truth. (ii) Fire Drill phenomenological descriptions match project activity time (for example, development). (iii) Regression models trained on ≈ 12–25 examples are sufficiently stable. Conclusion: The approach is data source-independent (source code or issue-tracking). It allows leveraging process artifacts for establishing additional phenomenon knowledge and training robust predictive models. The results indicate the aptness of the methodology for the identification of the Fire Drill and similar anti-pattern instances modeled using activities. Such identification could be used in post mortem process analysis supporting organizational learning for improving processes.

    Full text (pdf)
    fulltext
  • 15.
    Hönel, Sebastian
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Pícha, Petr
    University of West Bohemia, Czech Republic.
    Brada, Premek
    University of West Bohemia, Czech Republic.
    Rychtarova, Lenka
    Independent.
    Danek, Jakub
    Independent.
    Detection of the Fire Drill anti-pattern: 15 real-world projects with ground truth, issue-tracking data, source code density, models and code. 2023. Dataset
    Abstract [en]

    This package contains items for 9 real-world software projects. The data is supposed to aid the detection of the presence of the Fire Drill anti-pattern. We include data, ground truth, code, and notebooks. The data supports two distinct methods of detecting the AP: a) through issue-tracking data, and b) through the underlying source code. Therefore, this package includes the following:

    Original data:

    • For each project, its original artifacts (e.g., wikis, meeting minutes, mentor's notes, etc.)
    • Evaluation of raters' notes by the assessor

    Fire Drill in issue-tracking data:

    • Ground truth for whether and how strongly each project exhibits the Fire Drill AP, on a scale from 0 to 10. This was determined by two individual raters, who also reached a consensus.
    • Coefficients for indicators for the first method, per project.
    • Detailed issue-tracking data for each project: what occurred and when.
    • Time logs for each project.

    Fire Drill in source-code data:

    • Four technical reports that document the developed method: how to translate a description into a detectable pattern, and how to use that pattern to detect the Fire Drill's presence and score it (similar to the rating). Also includes a report on how activities were assigned to individual commits.
    • Source code density data (metrics) for each commit in each of the nine projects as a separate dataset.
    • Code: a snapshot of the repository that holds all code, models, notebooks, and pre-computed results, for utmost reproducibility (the code is written in R).
  • 16.
    Picha, Petr
    et al.
    University of West Bohemia, Czech Republic.
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Brada, Premek
    University of West Bohemia, Czech Republic.
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Danek, Jakub
    University of West Bohemia, Czech Republic.
    Process anti-pattern detection: a case study. 2022. In: Proceedings of the 27th European Conference on Pattern Languages of Programs, EuroPLoP 2022, Irsee, Germany, July 6-10, 2022, ACM Publications, 2022, pp. 1-18, article ID 5. Conference paper (Peer-reviewed)
    Abstract [en]

    Anti-patterns are harmful phenomena repeatedly occurring, e.g., in software development projects. Though widely recognized and well-known, their descriptions are traditionally not fit for automated detection. The detection is usually performed by manual audits, or on business process models. Both options are time-, effort- and expertise-heavy, prone to biases, and/or omissions. Meanwhile, collaborative software projects produce much data as a natural side product, capturing their status and day-to-day history. Long-term, our research aims at deriving models for the automated detection of process and project management anti-patterns, applicable to project data. Here, we present a general approach for studies investigating occurrences of these types of anti-patterns in projects and discuss the entire process of such studies in detail, starting from the anti-pattern descriptions in literature. We demonstrate and verify our approach with the Fire Drill anti-pattern detection as a case study, applying it to data from 15 student projects. The results of our study suggest that reliable detection of at least some process and project management anti-patterns in project data is possible, with 13 projects assessed accurately for Fire Drill presence by our automated detection when compared to the ground truth gathered from independent data. The overall approach can be similarly applied to detecting patterns and other phenomena with manifestations in Application Lifecycle Management data.

  • 17.
    Ulan, Maria
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Martins, Rafael Messias
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Kerren, Andreas
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Artifact: Quality Models Inside Out: Interactive Visualization of Software Metrics by Means of Joint Probabilities. 2018. Other (Peer-reviewed)
    Abstract [en]

    Assessing software quality, in general, is hard; each metric has a different interpretation, scale, range of values, or measurement method. Combining these metrics automatically is especially difficult, because they measure different aspects of software quality, and creating a single global final quality score limits the evaluation of the specific quality aspects and trade-offs that exist when looking at different metrics. We present a way to visualize multiple aspects of software quality. In general, software quality can be decomposed hierarchically into characteristics, which can be assessed by various direct and indirect metrics. These characteristics are then combined and aggregated to assess the quality of the software system as a whole. We introduce an approach for quality assessment based on joint distributions of metrics values. Visualizations of these distributions allow users to explore and compare the quality metrics of software systems and their artifacts, and to detect patterns, correlations, and anomalies. Furthermore, it is possible to identify common properties and flaws, as our visualization approach provides rich interactions for visual queries to the quality models’ multivariate data. We evaluate our approach in two use cases based on: 30 real-world technical documentation projects with 20,000 XML documents, and an open source project written in Java with 1000 classes. Our results show that the proposed approach allows an analyst to detect possible causes of bad or good quality.

  • 18.
    Ulan, Maria
    et al.
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Hönel, Sebastian
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Martins, Rafael Messias
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Ericsson, Morgan
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Löwe, Welf
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Wingkvist, Anna
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Kerren, Andreas
    Linnéuniversitetet, Fakulteten för teknik (FTK), Institutionen för datavetenskap och medieteknik (DM).
    Quality Models Inside Out: Interactive Visualization of Software Metrics by Means of Joint Probabilities. 2018. In: Proceedings of the 2018 Sixth IEEE Working Conference on Software Visualization (VISSOFT), Madrid, Spain, 2018 / [ed] J. Ángel Velázquez Iturbide, Jaime Urquiza Fuentes, Andreas Kerren, and Mircea F. Lungu, IEEE, 2018, pp. 65-75. Conference paper (Peer-reviewed)
    Abstract [en]

    Assessing software quality, in general, is hard; each metric has a different interpretation, scale, range of values, or measurement method. Combining these metrics automatically is especially difficult, because they measure different aspects of software quality, and creating a single global final quality score limits the evaluation of the specific quality aspects and trade-offs that exist when looking at different metrics. We present a way to visualize multiple aspects of software quality. In general, software quality can be decomposed hierarchically into characteristics, which can be assessed by various direct and indirect metrics. These characteristics are then combined and aggregated to assess the quality of the software system as a whole. We introduce an approach for quality assessment based on joint distributions of metrics values. Visualizations of these distributions allow users to explore and compare the quality metrics of software systems and their artifacts, and to detect patterns, correlations, and anomalies. Furthermore, it is possible to identify common properties and flaws, as our visualization approach provides rich interactions for visual queries to the quality models’ multivariate data. We evaluate our approach in two use cases based on: 30 real-world technical documentation projects with 20,000 XML documents, and an open source project written in Java with 1000 classes. Our results show that the proposed approach allows an analyst to detect possible causes of bad or good quality.
