Quantifying Process Quality: The Role of Effective Organizational Learning in Software Evolution
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (DISA;DSIQ;DISTA)
2023 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Real-world software applications must constantly evolve to remain relevant. This evolution occurs when developing new applications or adapting existing ones to meet new requirements, make corrections, or incorporate future functionality. Traditional methods of software quality control involve software quality models and continuous code inspection tools. These measures focus on directly assessing the quality of the software. However, there is a strong correlation and causation between the quality of the development process and the resulting software product. Therefore, improving the development process indirectly improves the software product, too. To achieve this, effective learning from past processes is necessary, often embraced through post-mortem organizational learning. While qualitative evaluation of large artifacts is common, smaller quantitative changes captured by application lifecycle management are often overlooked. In addition to software metrics, these smaller changes can reveal complex phenomena related to project culture and management. Leveraging these changes can help detect and address such complex issues.

Software evolution was previously measured by the size of changes, but the lack of consensus on a reliable and versatile quantification method prevents its use as a dependable metric. Different size classifications fail to reliably describe the nature of evolution. While application lifecycle management data is rich, identifying which artifacts can model detrimental managerial practices remains uncertain. Approaches such as simulation modeling, discrete-event simulation, or Bayesian networks have only limited ability to exploit continuous-time process models of such phenomena. Even worse, the accessibility and mechanistic insight of such gray- or black-box models are typically very low. To address these challenges, we suggest leveraging objectively captured digital artifacts from application lifecycle management, combined with qualitative analysis, for efficient organizational learning. A new language-independent metric is proposed to robustly capture the size of changes, significantly improving the accuracy of determining the nature of a change. The classified changes are then used to explore, visualize, and suggest maintenance activities, enabling solid prediction of the presence and severity of malpractice, even with limited data. Finally, parts of the automatic quantitative analysis are made accessible, so that they can partially replace expert-based qualitative analysis.

Place, publisher, year, edition, pages
Växjö: Linnaeus University Press, 2023.
Series
Linnaeus University Dissertations ; 504
Keywords [en]
Software Size, Software Metrics, Commit Classification, Maintenance Activities, Software Quality, Process Quality, Project Management, Organizational Learning, Machine Learning, Visualization, Optimization
National Category
Computer and Information Sciences; Software Engineering; Mathematical Analysis; Probability Theory and Statistics
Research subject
Computer Science, Software Technology; Computer Science, Information and software visualization; Computer and Information Sciences Computer Science, Computer Science; Statistics/Econometrics
Identifiers
URN: urn:nbn:se:lnu:diva-124916
DOI: 10.15626/LUD.504.2023
ISBN: 9789180820738 (print)
ISBN: 9789180820745 (electronic)
OAI: oai:DiVA.org:lnu-124916
DiVA, id: diva2:1800656
Public defence
2023-09-29, House D, D1136A, 351 95 Växjö, Växjö, 13:00 (English)
Available from: 2023-09-28 Created: 2023-09-27 Last updated: 2024-05-06. Bibliographically approved
List of papers
1. A changeset-based approach to assess source code density and developer efficacy
2018 (English) In: ICSE '18 Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, IEEE, 2018, p. 220-221. Conference paper, Published paper (Refereed)
Abstract [en]

The productivity of a (team of) developer(s) can be expressed as a ratio between effort and delivered functionality. Several different estimation models have been proposed. These are based on statistical analysis of real development projects; their accuracy depends on the number and the precision of data points. We propose a data-driven method to automate the generation of precise data points. Functionality is proportional to the code size, and Lines of Code (LoC) is a fundamental metric of code size. However, code size and LoC are not well defined, as they could include or exclude lines that do not affect the delivered functionality. We present a new approach to measure the density of code in software repositories. We demonstrate how the accuracy of development time spent in relation to delivered code can be improved when basing it on net- instead of gross-size measurements. We validated our tool by studying ca. 1,650 open-source software projects.

Place, publisher, year, edition, pages
IEEE, 2018
Series
Proceedings of the IEEE-ACM International Conference on Software Engineering Companion, ISSN 2574-1926, E-ISSN 2574-1934
Keywords
Software Repositories, Clone Detection, Source code density, Effort estimation
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-79016 (URN)
10.1145/3183440.3195105 (DOI)
000450109000080 ()
2-s2.0-85049691648 (Scopus ID)
978-1-4503-5663-3 (ISBN)
Conference
40th ACM/IEEE International Conference on Software Engineering (ICSE), MAY 27-JUN 03, 2018, Gothenburg, SWEDEN
Available from: 2018-12-06 Created: 2018-12-06 Last updated: 2023-09-27. Bibliographically approved
2. Git Density: Analyze git repositories to extract the Source Code Density and other Commit Properties
2020 (English) Other (Other academic)
Abstract [en]

Git Density (git-density) is a tool to analyze git-repositories with the goal of detecting the source code density. It was developed during the research phase of the short technical paper and poster "A changeset-based approach to assess source code density and developer efficacy" and has since been extended to support thorough analyses and insights.

Keywords
git, source code density, git-hours, software metrics
National Category
Computer Sciences; Computer and Information Sciences
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-98140 (URN)
10.5281/zenodo.2565238 (DOI)
Available from: 2020-09-23 Created: 2020-09-23 Last updated: 2023-09-28. Bibliographically approved
3. Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities
2019 (English) In: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS) / [ed] Dr. David Shepherd, IEEE, 2019, p. 109-120. Conference paper, Published paper (Refereed)
Abstract [en]

Commit classification, the automatic classification of the purpose of changes to software, can support the understanding and quality improvement of software and its development process. We introduce code density of a commit, a measure of the net size of a commit, as a novel feature and study how well it is suited to determine the purpose of a change. We also compare the accuracy of code-density-based classifications with existing size-based classifications. By applying standard classification models, we demonstrate the significance of code density for the accuracy of commit classification. We achieve up to 89% accuracy and a Kappa of 0.82 for the cross-project commit classification where the model is trained on one project and applied to other projects. Such highly accurate classification of the purpose of software changes helps to improve the confidence in software (process) quality analyses exploiting this classification information.

Place, publisher, year, edition, pages
IEEE, 2019
Keywords
Software Quality, Commit Classification, Source Code Density, Maintenance Activities
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-85473 (URN)
10.1109/QRS.2019.00027 (DOI)
000587580300014 ()
2-s2.0-85073775119 (Scopus ID)
9781728139272 (ISBN)
9781728139289 (ISBN)
Conference
The 19th IEEE International Conference on Software Quality, Reliability, and Security, July 22-26, 2019, Sofia, Bulgaria
Available from: 2019-06-17 Created: 2019-06-17 Last updated: 2024-01-22. Bibliographically approved
4. 359,569 commits with source code density; 1149 commits of which have software maintenance activity labels (adaptive, corrective, perfective)
2019 (English) Data set
Keywords
Software Maintenance, Software Evolution, Mining Software Repositories, Predictive Modeling, Software Quality
National Category
Software Engineering; Computer Sciences
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-98175 (URN)
10.5281/zenodo.2590518 (DOI)
Note

The dataset contains these tables:

- x1151:

  - The original dataset from Levin and Yehudai.

  - Despite its name, this dataset has only 1,149 commits, as two commits were duplicates in the original dataset.

  - This dataset spanned 11 projects, each of which had between 99 and 114 commits.

  - It has 71 features and spans the projects RxJava, hbase, elasticsearch, intellij-community, hadoop, drools, Kotlin, restlet-framework-java, orientdb, camel and spring-framework.

- gtools_ex (short for Git-Tools, extended):

  - Contains 359,569 commits, analyzed using Git-Tools in extended mode

  - It spans all commits and projects from the x1151 dataset as well.

  - All 11 projects were analyzed, from the initial commit until the end of January 2019. For the projects Intellij and Kotlin, the first 35,000 and 30,000 commits, respectively, were analyzed.

  - This dataset introduces 35 new features (see list below), 22 of which are size- or density-related.

The dataset contains these views:

- geX_L (short for Git-tools, extended, with labels): Joins the commits' labels from x1151 with the extended attributes from gtools_ex, using the commits' hashes.

- jeX_L (short for joined, extended, with labels): Joins the datasets x1151 and gtools_ex entirely, based on the commits' hashes.
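Both views are plain joins on the commit hash. Assuming the two tables are loaded as pandas DataFrames, the geX_L view could be reproduced roughly as below; the tiny stand-in frames and column names are illustrative assumptions, not the dataset's exact schema:

```python
import pandas as pd

# Illustrative stand-ins for the x1151 (labeled) and gtools_ex tables;
# the hashes, labels, and column names here are made up for the sketch.
x1151 = pd.DataFrame({
    "SHA1": ["a1", "b2", "c3"],
    "label": ["adaptive", "corrective", "perfective"],
})
gtools_ex = pd.DataFrame({
    "SHA1": ["a1", "b2", "c3", "d4"],
    "Density": [0.8, 0.5, 1.0, 0.0],
})

# geX_L: the commits' labels from x1151 joined with the extended
# attributes from gtools_ex, keyed on the commits' hashes. An inner
# join keeps only the labeled commits.
geX_L = x1151.merge(gtools_ex, on="SHA1", how="inner")
print(geX_L.shape)  # (3, 3)
```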

 

Features of the gtools_ex dataset:

- SHA1

- RepoPathOrUrl

- AuthorName

- CommitterName

- AuthorTime (UTC)

- CommitterTime (UTC)

- MinutesSincePreviousCommit: A double describing the number of minutes that passed since the previous commit. Here, previous refers to the parent commit, not the commit that is previous in time.

- Message: The commit's message/comment

- AuthorEmail

- CommitterEmail

- AuthorNominalLabel: All authors of a repository are analyzed and merged by Git-Density using a heuristic, even if they do not always use the same email address or name. This label is a unique string that helps identify the same author across commits, even if the author did not always use the exact same identity.

- CommitterNominalLabel: The same as AuthorNominalLabel, but for the committer this time.

- IsInitialCommit: A boolean indicating whether a commit is preceded by a parent or not.

- IsMergeCommit: A boolean indicating whether a commit has more than one parent.

- NumberOfParentCommits

- ParentCommitSHA1s: A comma-concatenated string of the parents' SHA1 IDs

- NumberOfFilesAdded

- NumberOfFilesAddedNet: Like the previous property, but if the net-size of all changes of an added file is zero (i.e. when adding a file that is empty/whitespace or does not contain code), then this property does not count the file.

- NumberOfLinesAddedByAddedFiles

- NumberOfLinesAddedByAddedFilesNet: Like the previous property, but counts the net-lines

- NumberOfFilesDeleted

- NumberOfFilesDeletedNet: Like the previous property, but considers only files that had net-changes

- NumberOfLinesDeletedByDeletedFiles

- NumberOfLinesDeletedByDeletedFilesNet: Like the previous property, but counts the net-lines

- NumberOfFilesModified

- NumberOfFilesModifiedNet: Like the previous property, but considers only files that had net-changes

- NumberOfFilesRenamed

- NumberOfFilesRenamedNet: Like the previous property, but considers only files that had net-changes

- NumberOfLinesAddedByModifiedFiles

- NumberOfLinesAddedByModifiedFilesNet: Like the previous property, but counts the net-lines

- NumberOfLinesDeletedByModifiedFiles

- NumberOfLinesDeletedByModifiedFilesNet: Like the previous property, but counts the net-lines

- NumberOfLinesAddedByRenamedFiles

- NumberOfLinesAddedByRenamedFilesNet: Like the previous property, but counts the net-lines

- NumberOfLinesDeletedByRenamedFiles

- NumberOfLinesDeletedByRenamedFilesNet: Like the previous property, but counts the net-lines

- Density: The ratio between the sum of all net-lines added+deleted+modified+renamed and the corresponding sum of gross-lines. A density of zero means that the sum of net-lines is zero (i.e., all line changes were just whitespace, comments, etc.). A density of 1 means that all changed lines contribute to the net-size of the commit (i.e., no useless lines containing, e.g., only comments or whitespace).

- AffectedFilesRatioNet: The ratio between the sums of NumberOfFilesXXX and NumberOfFilesXXXNet
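The Density feature above is simply a net-to-gross ratio of changed lines. A minimal sketch of that definition (illustrative only, not Git-Density's actual implementation):

```python
def commit_density(net_lines: int, gross_lines: int) -> float:
    """Ratio of net-lines (changes that remain after discarding
    whitespace-only and comment-only lines) to gross-lines changed
    by a commit. Illustrative sketch of the Density feature's
    definition, not Git-Density's code."""
    if gross_lines == 0:
        # No changed lines at all: define the density as zero.
        return 0.0
    return net_lines / gross_lines

# A commit touching 200 gross lines, of which 150 are actual code:
print(commit_density(150, 200))  # 0.75
# All changed lines were whitespace/comments only:
print(commit_density(0, 120))    # 0.0
```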

 

This dataset is supporting the paper "Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities", as submitted to the QRS2019 conference (The 19th IEEE International Conference on Software Quality, Reliability, and Security). Citation: Hönel, S., Ericsson, M., Löwe, W. and Wingkvist, A., 2019. Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities. In The 19th IEEE International Conference on Software Quality, Reliability, and Security.

The dataset was compressed using 7z and the PPMd algorithm.

Available from: 2020-09-25 Created: 2020-09-25 Last updated: 2024-01-22. Bibliographically approved
5. Using source code density to improve the accuracy of automatic commit classification into maintenance activities
2020 (English) In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 168, p. 1-19, article id 110673. Article in journal (Refereed) Published
Abstract [en]

Source code is changed for a reason, e.g., to adapt, correct, or perfect it. This reason can provide valuable insight into the development process but is rarely explicitly documented when the change is committed to a source code repository. Automatic commit classification uses features extracted from commits to estimate this reason.

We introduce source code density, a measure of the net size of a commit, and show how it improves the accuracy of automatic commit classification compared to previous size-based classifications. We also investigate how preceding generations of commits affect the class of a commit, and whether taking the code density of previous commits into account can improve the accuracy further.

We achieve up to 89% accuracy and a Kappa of 0.82 for the cross-project commit classification where the model is trained on one project and applied to other projects. Models trained on single projects yield accuracies of up to 93% with a Kappa approaching 0.90. The accuracy of the automatic commit classification has a direct impact on software (process) quality analyses that exploit the classification, so our improvements to the accuracy will also improve the confidence in such analyses.
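The cross-project setup described above amounts to training a standard classifier on one project's commits and predicting the maintenance activity of another project's commits. The snippet below sketches this with synthetic data and scikit-learn's random forest as a stand-in; the three features and the model choice are assumptions for illustration, not the paper's exact setup:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic stand-in data: three commit features (e.g., lines added,
# lines deleted, and source code density) and three maintenance-activity
# labels (0=adaptive, 1=corrective, 2=perfective).
X_train = rng.random((300, 3))     # commits of the training project
y_train = rng.integers(0, 3, 300)  # their (synthetic) activity labels
X_other = rng.random((50, 3))      # commits from a different project

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_other)  # predicted maintenance activity per commit
print(len(pred))  # 50
```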

Place, publisher, year, edition, pages
Elsevier, 2020
Keywords
Software quality, Commit classification, Source code density, Maintenance activities, Software evolution
National Category
Software Engineering; Computer Sciences
Research subject
Computer Science, Software Technology; Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-95751 (URN)
10.1016/j.jss.2020.110673 (DOI)
000557871300021 ()
2-s2.0-85085726544 (Scopus ID)
Available from: 2020-06-08 Created: 2020-06-08 Last updated: 2023-09-27. Bibliographically approved
6. Exploiting Relations, Sojourn-Times, and Joint Conditional Probabilities for Automated Commit Classification
2023 (English) In: Proceedings of the 18th International Conference on Software Technologies, July 10-12, 2023, Rome, Italy / [ed] Hans-Georg Fill, Francisco José Domínguez-Mayo, Marten van Sinderen, and Leszek A. Maciaszek, SciTePress, 2023, p. 323-331. Conference paper, Published paper (Refereed)
Abstract [en]

The automatic classification of commits can be exploited for numerous applications, such as fault prediction or determining maintenance activities. Additional properties, such as parent-child relations or sojourn-times between commits, were not previously considered for this task. However, such data cannot be leveraged well using traditional machine learning models, such as random forests. Suitable models are, e.g., Conditional Random Fields or recurrent neural networks. We reason about the Markovian nature of the problem and propose models to address it. The first model is a generalized dependent mixture model, facilitating the Forward algorithm for 1st- and 2nd-order processes, using maximum likelihood estimation. We then propose a second, non-parametric model that uses Bayesian segmentation and kernel density estimation, which can be effortlessly adapted to work with nth-order processes. Using an existing dataset with labeled commits as ground truth, we extend this dataset with relations between and sojourn-times of commits, by first re-engineering the labeling rules and achieving a high agreement between labelers. We show the strengths and weaknesses of either kind of model and demonstrate their ability to outperform the state-of-the-art in automated commit classification.

Place, publisher, year, edition, pages
SciTePress, 2023
Series
ICSOFT, ISSN 2184-2833
Keywords
Software Maintenance, Repository Mining, Maintenance Activities
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-124879 (URN)
10.5220/0012077300003538 (DOI)
9789897586651 (ISBN)
Conference
18th International Conference on Software Technologies - ICSOFT 2023, Rome, Italy, July 10–12, 2023
Available from: 2023-09-25 Created: 2023-09-25 Last updated: 2024-05-06. Bibliographically approved
7. Contextual Operationalization of Metrics as Scores: Is My Metric Value Good?
2022 (English) In: Proceedings of the 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), IEEE, 2022, p. 333-343. Conference paper, Published paper (Refereed)
Abstract [en]

Software quality models aggregate metrics to indicate quality. Most metrics reflect counts derived from events or attributes that cannot directly be associated with quality. Worse, what constitutes a desirable value for a metric may vary across contexts. We demonstrate an approach to transforming arbitrary metrics into absolute quality scores by leveraging metrics captured from similar contexts. In contrast to metrics, scores represent freestanding quality properties that are also comparable. We provide a web-based tool for obtaining contextualized scores for metrics as obtained from one’s software. Our results indicate that significant differences among various metrics and contexts exist. The suggested approach works with arbitrary contexts. Given sufficient contextual information, it allows for answering the question of whether a metric value is good/bad or common/extreme.

Place, publisher, year, edition, pages
IEEE, 2022
Series
IEEE International Conference on Software Quality, Reliability and Security (QRS), ISSN 2693-9185, E-ISSN 2693-9177
Keywords
Software quality, Metrics, Scores, Software Domains, Measurement, Aggregates, Software quality, Software reliability, Security, software metrics, absolute quality scores, arbitrary metrics, contextual operationalization, contextualized scores, quality properties, software quality models, Web-based tool
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer and Information Sciences Computer Science; Computer Science, Software Technology; Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-120165 (URN)
10.1109/QRS57517.2022.00042 (DOI)
2-s2.0-85151404427 (Scopus ID)
9781665477048 (ISBN)
Conference
2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), Guangzhou, China, 5-9 Dec. 2022
Available from: 2023-04-12 Created: 2023-04-12 Last updated: 2023-09-27. Bibliographically approved
8. Metrics As Scores: A Tool- and Analysis Suite and Interactive Application for Exploring Context-Dependent Distributions
2023 (English) In: Journal of Open Source Software, E-ISSN 2475-9066, Vol. 8, no 88, article id 4913. Article in journal (Refereed) Published
Abstract [en]

Metrics As Scores can be thought of as an interactive, multiple analysis of variance (abbr. "ANOVA," Chambers et al., 2017). An ANOVA might be used to estimate the goodness-of-fit of a statistical model. Beyond ANOVA, which is used to analyze the differences among hypothesized group means for a single quantity (feature), Metrics As Scores seeks to answer the question of whether a sample of a certain feature is more or less common across groups. This approach to data visualization and exploration has been used previously (e.g., Jiang et al., 2022). Beyond this, Metrics As Scores can determine what might constitute a good/bad, acceptable/alarming, or common/extreme value, and how distant the sample is from that value, for each group. This is expressed in terms of a percentile (a standardized scale of [0, 1]), which we call a score. Considering all available features among the existing groups furthermore allows the user to assess how different the groups are from each other, or whether they are indistinguishable from one another. The name Metrics As Scores was derived from its initial application: examining differences of software metrics across application domains (Hönel et al., 2022). A software metric is an aggregation of one or more raw features according to some well-defined standard, method, or calculation. In software processes, such aggregations are often counts of events or certain properties (Florac & Carleton, 1999). However, without the aggregation that is done in a quality model, raw data (samples) and software metrics are rarely of great value to analysts and decision-makers. This is because quality models are conceived to establish a connection between software metrics and certain quality goals (Kaner & Bond, 2004). It is, therefore, difficult to answer the question "is my metric value good?".
With Metrics As Scores we present an approach that, given some ideal value, can transform any sample into a score, provided a sample of sufficiently many relevant values. While previous attempts derived such ideal values for software metrics from, e.g., experience or surveys (Benlarbi et al., 2000), benchmarks (Alves et al., 2010), or by setting practical values (Grady, 1992), with Metrics As Scores we suggest additionally deriving ideal values in non-parametric, statistical ways. To do so, data first needs to be captured in a relevant context (group). A feature value might be good in one context, while it is less so in another. Therefore, we suggest generalizing and contextualizing the approach taken by Ulan et al. (2021), in which a score is defined to always have a range of [0, 1] and linear behavior. This means that scores can now also be compared and that a fixed increment in any score is equally valuable among scores. This is otherwise not the case for raw features. Metrics As Scores consists of a tool- and analysis suite and an interactive application that allows researchers to explore and understand differences in scores across groups. The operationalization of features as scores lies in gathering values that are context-specific (group-typical), determining an ideal value non-parametrically or by user preference, and then transforming the observed values into distances. Metrics As Scores enables this procedure by unifying the way of obtaining probability densities/masses and conducting appropriate statistical tests. More than 120 different parametric distributions (approx. 20 of which are discrete) are fitted through a common interface. Those distributions are part of the scipy package for the Python programming language, which Metrics As Scores makes extensive use of (Virtanen et al., 2020). While fitting continuous distributions is straightforward using maximum likelihood estimation, many discrete distributions have integral parameters.
For these, Metrics As Scores solves a mixed-variable global optimization problem using a genetic algorithm in pymoo (Blank & Deb, 2020). In addition, empirical distributions (continuous and discrete) and smooth approximate kernel density estimates are available. Applicable statistical tests for assessing the goodness-of-fit are automatically performed. These tests are used to select some best-fitting random variable in the interactive web application. As an application written in Python, Metrics As Scores is made available as a package that is installable from the Python Package Index (PyPI): pip install metrics-as-scores. As such, the application can be used in a stand-alone manner and does not require additional packages, such as a web server or third-party libraries.
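The core idea of turning a raw metric value into a score on [0, 1] can be illustrated with a deliberately simplified, non-parametric sketch: given a sample of values from one context and an ideal value, express how close a value lies to the ideal as the fraction of context values that are at least as distant. The function and its exact transformation below are illustrative simplifications, not Metrics As Scores' implementation:

```python
import numpy as np

def score(value: float, context_sample: np.ndarray, ideal: float) -> float:
    """Toy version of the metric-value-to-score idea: a score of 1.0
    means the value hits the context's ideal exactly; scores near 0.0
    mean the value is extreme for this context. Illustrative only."""
    distances = np.abs(context_sample - ideal)  # distances of the sample
    d = abs(value - ideal)                      # distance of our value
    # Fraction of context values that lie at least as far from the ideal:
    return float(np.mean(distances >= d))

# Metric values observed in one context (group), and an assumed ideal of 3:
ctx = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(score(3.0, ctx, ideal=3.0))  # 1.0 (exactly the ideal value)
print(score(5.0, ctx, ideal=3.0))  # 0.4 (fairly extreme in this context)
```

Because every such score lives on the same [0, 1] scale, scores computed for different metrics and contexts become directly comparable, which raw feature values are not.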

Place, publisher, year, edition, pages
Open Journals, 2023
Keywords
Metrics, Visualization, Conditional Distributions
National Category
Probability Theory and Statistics; Software Engineering
Research subject
Statistics/Econometrics; Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-124881 (URN)
10.21105/joss.04913 (DOI)
Available from: 2023-09-25 Created: 2023-09-25 Last updated: 2024-05-06. Bibliographically approved
9. Technical Reports Compilation: Detecting the Fire Drill Anti-pattern Using Source Code and Issue-Tracking Data
2023 (English) Report (Other academic)
Abstract [en]

Detecting the presence of project management anti-patterns (AP) currently requires experts on the matter and is an expensive endeavor. Worse, experts may introduce their individual subjectivity or bias. Using the Fire Drill AP, we first introduce a novel way to translate descriptions into detectable AP that are composed of arbitrary metrics and events, such as logged time or maintenance activities, which are mined from the underlying source code or issue-tracking data, thus making the description objective as it becomes data-based. Secondly, we demonstrate a novel method to quantify and score the deviations of real-world projects from data-based AP descriptions. Using fifteen real-world projects that exhibit a Fire Drill to some degree, we show how to further enhance the translated AP. The ground truth in these projects was extracted from two individual experts, and consensus was found between them. We introduce a novel method called automatic calibration, which optimizes a pattern such that only necessary and important scores remain that suffice to confidently detect the degree to which the AP is present. Without automatic calibration, the proposed patterns show only weak potential for detecting the presence. Enriching the AP with data from real-world projects significantly improves this potential. We also introduce a no-pattern approach that exploits the ground truth for establishing a new, quantitative understanding of the phenomenon, as well as for finding gray-/black-box predictive models. We conclude that presence detection and severity assessment of the Fire Drill anti-pattern, as well as of some related and similar patterns, is certainly possible using some of the presented approaches.

Publisher
p. 338
National Category
Computer Sciences; Software Engineering; Probability Theory and Statistics
Research subject
Computer and Information Sciences Computer Science, Computer Science; Computer Science, Software Technology; Natural Science, Mathematics; Statistics/Econometrics
Identifiers
urn:nbn:se:lnu:diva-105772 (URN)
10.48550/arXiv.2104.15090 (DOI)
Available from: 2021-07-07 Created: 2021-07-07 Last updated: 2023-09-28. Bibliographically approved
10. Process anti-pattern detection: a case study
2022 (English) In: Proceedings of the 27th European Conference on Pattern Languages of Programs, EuroPLop 2022, Irsee, Germany, July 6-10, 2022, ACM Publications, 2022, p. 1-18, article id 5. Conference paper, Published paper (Refereed)
Abstract [en]

Anti-patterns are harmful phenomena that occur repeatedly, e.g., in software development projects. Though widely recognized and well-known, their descriptions are traditionally not fit for automated detection. Detection is usually performed by manual audits or on business process models. Both options are time-, effort-, and expertise-heavy, and prone to biases and/or omissions. Meanwhile, collaborative software projects produce much data as a natural side product, capturing their status and day-to-day history. Long-term, our research aims at deriving models for the automated detection of process and project management anti-patterns, applicable to project data. Here, we present a general approach for studies investigating occurrences of these types of anti-patterns in projects and discuss the entire process of such studies in detail, starting from the anti-pattern descriptions in the literature. We demonstrate and verify our approach with the Fire Drill anti-pattern detection as a case study, applying it to data from 15 student projects. The results of our study suggest that reliable detection of at least some process and project management anti-patterns in project data is possible, with 13 projects assessed accurately for Fire Drill presence by our automated detection when compared to the ground truth gathered from independent data. The overall approach can be similarly applied to detecting patterns and other phenomena with manifestations in Application Lifecycle Management data.

Place, publisher, year, edition, pages
ACM Publications, 2022
Keywords
Pattern detection, Project management anti-patterns, Software process anti-patterns, ALM tools, Fire Drill
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-120164 (URN)
10.1145/3551902.3551965 (DOI)
2-s2.0-85148442751 (Scopus ID)
9781450395946 (ISBN)
Conference
EuroPLop '22: Proceedings of the 27th European Conference on Pattern Languages of Programs, Irsee, Germany, July 6-10, 2022
Available from: 2023-04-12 Created: 2023-04-12 Last updated: 2023-09-27. Bibliographically approved

Open Access in DiVA

Comprehensive summary (5477 kB), 427 downloads
File information
File name: FULLTEXT01.pdf
File size: 5477 kB
Checksum (SHA-512): 226da70bbf9025fddf5001c9f00eed590c299308ceee6a0d41eac7b21f9e68add797307b0e660213f922066147788815321ef55a6ef00e607e6279c1209d61b0
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text: https://doi.org/10.48550/arXiv.2305.18061
Buy Book (SEK 250 + VAT and postage): lnupress@lnu.se

Authority records

Hönel, Sebastian
