Ericsson, Morgan, Docent
ORCID iD: orcid.org/0000-0003-1173-5187
Publications (10 of 81)
Hönel, S., Ericsson, M., Löwe, W. & Wingkvist, A. (2023). Metrics As Scores: A Tool- and Analysis Suite and Interactive Application for Exploring Context-Dependent Distributions. Journal of Open Source Software, 8(88), Article ID 4913.
2023 (English). In: Journal of Open Source Software, E-ISSN 2475-9066, Vol. 8, no. 88, article id 4913. Article in journal (Refereed) Published
Abstract [en]

Metrics As Scores can be thought of as an interactive, multiple analysis of variance (abbr. "ANOVA," Chambers et al., 2017). An ANOVA might be used to estimate the goodness-of-fit of a statistical model. Beyond ANOVA, which is used to analyze the differences among hypothesized group means for a single quantity (feature), Metrics As Scores seeks to answer the question of whether a sample of a certain feature is more or less common across groups. This approach to data visualization and exploration has been used previously (e.g., Jiang et al., 2022). Beyond this, Metrics As Scores can determine what might constitute a good/bad, acceptable/alarming, or common/extreme value, and how distant the sample is from that value, for each group. This is expressed in terms of a percentile (a standardized scale of [0, 1]), which we call a score. Considering all available features among the existing groups furthermore allows the user to assess how different the groups are from each other, or whether they are indistinguishable from one another. The name Metrics As Scores was derived from its initial application: examining differences of software metrics across application domains (Hönel et al., 2022). A software metric is an aggregation of one or more raw features according to some well-defined standard, method, or calculation. In software processes, such aggregations are often counts of events or certain properties (Florac & Carleton, 1999). However, without the aggregation that is done in a quality model, raw data (samples) and software metrics are rarely of great value to analysts and decision-makers. This is because quality models are conceived to establish a connection between software metrics and certain quality goals (Kaner & Bond, 2004). It is, therefore, difficult to answer the question "is my metric value good?".
With Metrics As Scores we present an approach that, given some ideal value, can transform any sample into a score, given a sample of sufficiently many relevant values. While such ideal values for software metrics were previously attempted to be derived from, e.g., experience or surveys (Benlarbi et al., 2000), benchmarks (Alves et al., 2010), or by setting practical values (Grady, 1992), with Metrics As Scores we suggest deriving ideal values additionally in non-parametric, statistical ways. To do so, data first needs to be captured in a relevant context (group). A feature value might be good in one context, while it is less so in another. Therefore, we suggest generalizing and contextualizing the approach taken by Ulan et al. (2021), in which a score is defined to always have a range of [0, 1] and linear behavior. This means that scores can now also be compared and that a fixed increment in any score is equally valuable among scores. This is not the case for raw features, otherwise. Metrics As Scores consists of a tool- and analysis suite and an interactive application that allows researchers to explore and understand differences in scores across groups. The operationalization of features as scores lies in gathering values that are context-specific (group-typical), determining an ideal value non-parametrically or by user preference, and then transforming the observed values into distances. Metrics As Scores enables this procedure by unifying the way of obtaining probability densities/masses and conducting appropriate statistical tests. More than 120 different parametric distributions (approx. 20 of which are discrete) are fitted through a common interface. Those distributions are part of the scipy package for the Python programming language, which Metrics As Scores makes extensive use of (Virtanen et al., 2020). While fitting continuous distributions is straightforward using maximum likelihood estimation, many discrete distributions have integral parameters. 
For these, Metrics As Scores solves a mixed-variable global optimization problem using a genetic algorithm in pymoo (Blank & Deb, 2020). In addition, empirical distributions (continuous and discrete) and smooth approximate kernel density estimates are available. Applicable statistical tests for assessing the goodness-of-fit are performed automatically. These tests are used to select a best-fitting random variable in the interactive web application. As an application written in Python, Metrics As Scores is made available as a package that is installable from the Python Package Index (PyPI): pip install metrics-as-scores. As such, the application can be used in a stand-alone manner and does not require additional packages, such as a web server or third-party libraries.
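
The transformation of a raw value into a context-dependent score can be illustrated with a small sketch. It is not the Metrics As Scores implementation: it assumes an empirical distribution only, takes the group median as the ideal value when none is supplied, and reads the score as the share of context values that lie at least as far from the ideal as the observed value. The function name `make_score` and the sample data are hypothetical.

```python
from bisect import bisect_left
from statistics import median


def make_score(context_values, ideal=None):
    """Build a scoring function for one context (group).

    Assumption for this sketch: the score of x is the fraction of context
    values whose distance to the ideal is at least as large as that of x.
    """
    if not context_values:
        raise ValueError("a context needs at least one observed value")
    if ideal is None:
        # Non-parametric choice of the ideal value (assumed: the median).
        ideal = median(context_values)
    # Distances of all context values to the ideal, sorted for fast lookup.
    distances = sorted(abs(v - ideal) for v in context_values)
    n = len(distances)

    def score(x):
        d = abs(x - ideal)
        strictly_closer = bisect_left(distances, d)  # count of distances < d
        return 1.0 - strictly_closer / n

    return score


# Hypothetical samples of the same feature captured in two contexts.
score_a = make_score([10, 12, 14, 16, 18])
score_b = make_score([20, 25, 30, 35, 40])
print(score_a(30), score_b(30))  # 0.0 1.0
```

The same raw value (30) is extreme in the first context and ideal in the second, which is the kind of context dependence the abstract describes.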

Place, publisher, year, edition, pages
Open Journals, 2023
Keywords
Metrics, Visualization, Conditional Distributions
National Category
Probability Theory and Statistics; Software Engineering
Research subject
Statistics/Econometrics; Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-124881 (URN); 10.21105/joss.04913 (DOI)
Available from: 2023-09-25. Created: 2023-09-25. Last updated: 2024-02-28. Bibliographically approved
Olsson, T., Ericsson, M. & Wingkvist, A. (2023). Optimized Machine Learning Input for Evolutionary Source Code to Architecture Mapping. In: Batista, T., Bureš, T., Raibulet, C., Muccini, H. (Ed.), Software Architecture. ECSA 2022 Tracks and Workshops. ECSA 2022. Paper presented at Software Architecture. ECSA 2022 Tracks and Workshops, Prague, Czech Republic, September 19–23, 2022 (pp. 421-435). Springer
2023 (English). In: Software Architecture. ECSA 2022 Tracks and Workshops. ECSA 2022 / [ed] Batista, T., Bureš, T., Raibulet, C., Muccini, H., Springer, 2023, p. 421-435. Conference paper, Published paper (Refereed)
Abstract [en]

Automatically mapping source code to architectural modules is an interesting and difficult problem. Mapping can be considered a classification problem, and machine learning approaches have been used to automatically generate mappings. Feature engineering is an essential element of machine learning. We study which source code features are important for an algorithm to function effectively. Additionally, we examine stemming and data cleaning. We systematically evaluate various combinations of features on five datasets created from JabRef, TeamMates, ProM, and two Hadoop subsystems. The systems are open-source with well-established mappings. We find that no single set of features consistently provides the highest performance, and even the subsystems of Hadoop have varied optimal feature combinations. Stemming provided minimal benefit, and cleaning the data is not worth the effort, as it also provided minimal benefit.

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 13928
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-126305 (URN); 10.1007/978-3-031-36889-9_28 (DOI); 9783031368882 (ISBN); 9783031368899 (ISBN)
Conference
Software Architecture. ECSA 2022 Tracks and Workshops, Prague, Czech Republic, September 19–23, 2022
Available from: 2024-01-09. Created: 2024-01-09. Last updated: 2024-02-01. Bibliographically approved
Hönel, S., Ericsson, M., Löwe, W. & Wingkvist, A. (2022). Contextual Operationalization of Metrics as Scores: Is My Metric Value Good?. In: Proceedings of the 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS). Paper presented at 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), Guangzhou, China, 5-9 Dec. 2022 (pp. 333-343). IEEE
2022 (English). In: Proceedings of the 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), IEEE, 2022, p. 333-343. Conference paper, Published paper (Refereed)
Abstract [en]

Software quality models aggregate metrics to indicate quality. Most metrics reflect counts derived from events or attributes that cannot directly be associated with quality. Worse, what constitutes a desirable value for a metric may vary across contexts. We demonstrate an approach to transforming arbitrary metrics into absolute quality scores by leveraging metrics captured from similar contexts. In contrast to metrics, scores represent freestanding quality properties that are also comparable. We provide a web-based tool for obtaining contextualized scores for metrics as obtained from one’s software. Our results indicate that significant differences among various metrics and contexts exist. The suggested approach works with arbitrary contexts. Given sufficient contextual information, it allows for answering the question of whether a metric value is good/bad or common/extreme.

Place, publisher, year, edition, pages
IEEE, 2022
Series
IEEE International Conference on Software Quality, Reliability and Security (QRS), ISSN 2693-9185, E-ISSN 2693-9177
Keywords
Software quality, Metrics, Scores, Software Domains, Measurement, Aggregates, Software quality, Software reliability, Security, software metrics, absolute quality scores, arbitrary metrics, contextual operationalization, contextualized scores, quality properties, software quality models, Web-based tool
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer and Information Sciences Computer Science; Computer Science, Software Technology; Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-120165 (URN); 10.1109/QRS57517.2022.00042 (DOI); 2-s2.0-85151404427 (Scopus ID); 9781665477048 (ISBN)
Conference
2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), Guangzhou, China, 5-9 Dec. 2022
Available from: 2023-04-12. Created: 2023-04-12. Last updated: 2023-09-27. Bibliographically approved
Olsson, T., Ericsson, M. & Wingkvist, A. (2022). Mapping Source Code to Modular Architectures Using Keywords. In: Scandurra, P., Galster, M., Mirandola, R., Weyns, D. (Ed.), Software Architecture. ECSA 2021. Paper presented at European Conference on Software Architecture, ECSA 2021, Virtual, Online, 13-17 September 2021 (pp. 65-85). Springer
2022 (English). In: Software Architecture. ECSA 2021 / [ed] Scandurra, P., Galster, M., Mirandola, R., Weyns, D., Springer, 2022, p. 65-85. Conference paper, Published paper (Refereed)
Abstract [en]

We implement an automatic mapper that can find the corresponding architectural module for a source code file. The mapper is based on multinomial naive Bayes, and it is trained using custom keywords for each architectural module. The mapper uses the path and file name of source code elements for prediction. We find that the needed keywords often match the module names; however, ambiguities and discrepancies exist. We evaluate the mapper using ten open-source systems with a mapping to an intended architecture and find that the mapper can successfully create a mapping with perfect precision. Still, it cannot cover all source code elements in most cases. However, other techniques can use the mapping as a foothold and automatically create further mappings. We also apply the approach to two cases where the architecture has been recovered from the implementation and find that the approach currently has limitations of applicability in such architectures. 
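
A keyword-trained multinomial naive Bayes mapper of the kind described above can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the tokenization rule, the uniform prior, the Laplace smoothing, the choice to leave files without any known keyword unmapped, and all names (`tokenize`, `KeywordMapper`, the modules and keywords) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(path):
    """Split a path into lowercase word tokens, also at camelCase boundaries."""
    pattern = r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+"
    return [t.lower() for t in re.findall(pattern, path)]


class KeywordMapper:
    """Multinomial naive Bayes trained on per-module keywords (lowercase)."""

    def __init__(self, keywords_by_module, alpha=1.0):
        self.vocab = {k for ks in keywords_by_module.values() for k in ks}
        # Assumed uniform prior over modules.
        self.log_prior = -math.log(len(keywords_by_module))
        self.log_likelihood = {}
        for module, keywords in keywords_by_module.items():
            counts = Counter(keywords)
            total = len(keywords) + alpha * len(self.vocab)
            # Laplace-smoothed log P(word | module).
            self.log_likelihood[module] = {
                word: math.log((counts[word] + alpha) / total)
                for word in self.vocab
            }

    def predict(self, path):
        # Only tokens that are known keywords carry evidence.
        tokens = [t for t in tokenize(path) if t in self.vocab]
        if not tokens:
            return None  # no evidence: leave the file unmapped
        scores = {
            module: self.log_prior + sum(table[t] for t in tokens)
            for module, table in self.log_likelihood.items()
        }
        return max(scores, key=scores.get)


mapper = KeywordMapper({
    "gui": ["gui", "dialog", "preferences"],
    "logic": ["logic", "importer"],
    "model": ["model", "entry"],
})
print(mapper.predict("src/main/java/org/app/gui/PreferencesDialog.java"))  # gui
print(mapper.predict("src/util/StringHelper.java"))  # None
```

Returning `None` for files without any keyword hit is one simple way to trade coverage for precision; other techniques could then start from the files that were mapped.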

Place, publisher, year, edition, pages
Springer, 2022
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13365
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-118118 (URN); 10.1007/978-3-031-15116-3_4 (DOI); 000874750000004 (); 2-s2.0-85136962515 (Scopus ID); 9783031151156 (ISBN); 9783031151163 (ISBN)
Conference
European Conference on Software Architecture, ECSA 2021, Virtual, Online, 13-17 September 2021
Available from: 2023-01-03. Created: 2023-01-03. Last updated: 2023-03-30. Bibliographically approved
Picha, P., Hönel, S., Brada, P., Ericsson, M., Löwe, W., Wingkvist, A. & Danek, J. (2022). Process anti-pattern detection: a case study. In: Proceedings of the 27th European Conference on Pattern Languages of Programs, EuroPLop 2022, Irsee, Germany, July 6-10, 2022. Paper presented at EuroPLop '22: Proceedings of the 27th European Conference on Pattern Languages of Programs, Irsee, Germany, July 6-10, 2022 (pp. 1-18). ACM Publications, Article ID 5.
2022 (English). In: Proceedings of the 27th European Conference on Pattern Languages of Programs, EuroPLop 2022, Irsee, Germany, July 6-10, 2022, ACM Publications, 2022, p. 1-18, article id 5. Conference paper, Published paper (Refereed)
Abstract [en]

Anti-patterns are harmful phenomena repeatedly occurring, e.g., in software development projects. Though widely recognized and well-known, their descriptions are traditionally not fit for automated detection. The detection is usually performed by manual audits, or on business process models. Both options are time-, effort-, and expertise-heavy and prone to biases and/or omissions. Meanwhile, collaborative software projects produce much data as a natural side product, capturing their status and day-to-day history. Long-term, our research aims at deriving models for the automated detection of process and project management anti-patterns, applicable to project data. Here, we present a general approach for studies investigating occurrences of these types of anti-patterns in projects and discuss the entire process of such studies in detail, starting from the anti-pattern descriptions in literature. We demonstrate and verify our approach with the Fire Drill anti-pattern detection as a case study, applying it to data from 15 student projects. The results of our study suggest that reliable detection of at least some process and project management anti-patterns in project data is possible, with 13 projects assessed accurately for Fire Drill presence by our automated detection when compared to the ground truth gathered from independent data. The overall approach can be similarly applied to detecting patterns and other phenomena with manifestations in Application Lifecycle Management data.

Place, publisher, year, edition, pages
ACM Publications, 2022
Keywords
Pattern detection, Project management anti-patterns, Software process anti-patterns, ALM tools, Fire Drill
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-120164 (URN); 10.1145/3551902.3551965 (DOI); 2-s2.0-85148442751 (Scopus ID); 9781450395946 (ISBN)
Conference
EuroPLop '22: Proceedings of the 27th European Conference on Pattern Languages of Programs, Irsee, Germany, July 6-10, 2022
Available from: 2023-04-12. Created: 2023-04-12. Last updated: 2023-09-27. Bibliographically approved
Olsson, T., Ericsson, M. & Wingkvist, A. (2022). To automatically map source code entities to architectural modules with Naive Bayes. Journal of Systems and Software, 183, Article ID 111095.
2022 (English). In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 183, article id 111095. Article in journal (Refereed) Published
Abstract [en]

Background: The process of mapping a source code entity onto an architectural module is to a large degree a manual task. Automating this process could increase the use of static architecture conformance checking methods, such as reflexion modeling, in industry. Current techniques rely on user parameterization and a highly cohesive design. A machine learning approach would potentially require fewer parameters and better use of the available information to aid in automatic mapping.

Aim: We investigate how a classifier can be trained to map from source code to architecture modules automatically. This classifier is trained with semantic and syntactic dependency information extracted from the source code and from architecture descriptions. The classifier is implemented using multinomial naive Bayes and evaluated.

Method: We perform experiments and compare the classifier with three state-of-the-art mapping functions in eight open-source Java systems with known ground-truth-mappings.

Results: We find that the classifier outperforms the state-of-the-art in all cases and that it provides a useful baseline for further research in the area of semi-automatic incremental clustering.

Conclusions: We conclude that machine learning is a useful approach that performs better and with less need for parameterization compared to other approaches. Future work includes investigating problematic mappings and a more diverse set of subject systems.

Place, publisher, year, edition, pages
Elsevier, 2022
Keywords
Incremental clustering, Orphan adoption, Naive Bayes, Software architecture, Machine learning
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-107510 (URN); 10.1016/j.jss.2021.111095 (DOI); 000709898800002 (); 2-s2.0-85117148740 (Scopus ID)
Available from: 2021-10-15. Created: 2021-10-15. Last updated: 2024-02-09. Bibliographically approved
Olsson, T., Ericsson, M. & Wingkvist, A. (2021). A Preliminary Study on the Use of Key-words for Source Code to Architecture Mappings. In: Robert Heinrich, Raffaela Mirandola, Danny Weyns (Ed.), Companion Proceedings of the 15th European Conference on Software Architecture. Paper presented at 15th European Conference on Software Architecture, ECSA Virtual (originally: Växjö, Sweden), 13-17 September, 2021. CEUR-WS.org, 2978
2021 (English). In: Companion Proceedings of the 15th European Conference on Software Architecture / [ed] Robert Heinrich, Raffaela Mirandola, Danny Weyns, CEUR-WS.org, 2021, Vol. 2978. Conference paper, Published paper (Refereed)
Abstract [en]

We implement an automatic mapper that can find the corresponding architectural module for a source code file. The mapper is based on multinomial naive Bayes, and it is trained using custom keywords for each architectural module. For prediction, the mapper uses the path and file name of source code elements. We find that the needed keywords often match the module names, but also that ambiguities and discrepancies exist. We evaluate the mapper using nine open-source systems and find that the mapper can successfully create a mapping with perfect precision, but in most cases, it cannot cover all source code elements. Other techniques can, however, use the mapping as a foothold and create further mappings.

Place, publisher, year, edition, pages
CEUR-WS.org, 2021
Series
CEUR Workshop Proceedings, ISSN 1613-0073 ; 2978
Keywords
Incremental clustering, Orphan adoption, Naive Bayes, Software architecture, Machine learning
National Category
Software Engineering
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-110143 (URN); 2-s2.0-85117820744 (Scopus ID)
Conference
15th European Conference on Software Architecture, ECSA Virtual (originally: Växjö, Sweden), 13-17 September, 2021
Available from: 2022-02-04. Created: 2022-02-04. Last updated: 2023-04-12. Bibliographically approved
Ulan, M., Löwe, W., Ericsson, M. & Wingkvist, A. (2021). Copula-based software metrics aggregation. Software quality journal, 29, 863-899
2021 (English). In: Software quality journal, ISSN 0963-9314, E-ISSN 1573-1367, Vol. 29, p. 863-899. Article in journal (Refereed) Published
Abstract [en]

A quality model is a conceptual decomposition of an abstract notion of quality into relevant, possibly conflicting characteristics and further into measurable metrics. For quality assessment and decision making, metrics values are aggregated to characteristics and ultimately to quality scores. Aggregation has often been problematic as quality models do not provide the semantics of aggregation. This makes it hard to formally reason about metrics, characteristics, and quality. We argue that aggregation needs to be interpretable and mathematically well defined in order to assess, to compare, and to improve quality. To address this challenge, we propose a probabilistic approach to aggregation and define quality scores based on joint distributions of absolute metrics values. To evaluate the proposed approach and its implementation under realistic conditions, we conduct empirical studies on bug prediction of ca. 5000 software classes, maintainability of ca. 15000 open-source software systems, and on the information quality of ca. 100000 real-world technical documents. We found that our approach is feasible, accurate, and scalable in performance.
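
As a rough illustration of scoring against a joint distribution instead of metric by metric, the sketch below uses a joint empirical survival function. It deliberately swaps this simpler estimate in for the copula construction of the paper, and it assumes that lower values are better for every metric. The function name `joint_score` and the data are hypothetical.

```python
def joint_score(candidate, reference):
    """Share of reference observations that are no better than `candidate`
    on every metric at once (assumption: lower metric values are better)."""
    if not reference:
        raise ValueError("reference sample must not be empty")
    if any(len(obs) != len(candidate) for obs in reference):
        raise ValueError("all observations need the same number of metrics")
    dominated = sum(
        1
        for obs in reference
        if all(o >= c for o, c in zip(obs, candidate))
    )
    return dominated / len(reference)


# Hypothetical reference sample: (complexity, coupling) per software class.
reference = [(10, 2), (20, 5), (30, 9), (40, 4)]
print(joint_score((20, 5), reference))  # 0.5
```

Because the comparison is made on all metrics jointly, the result depends on how the metrics co-occur in the reference sample, which a product of per-metric scores would ignore.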

Place, publisher, year, edition, pages
Springer, 2021
Keywords
Quality assessment, Quantitative methods, Software metrics, Aggregation, Multivariate statistical methods, Probabilistic models, Copula
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-106779 (URN); 10.1007/s11219-021-09568-9 (DOI); 000687914800001 (); 2-s2.0-85113308308 (Scopus ID); 2021 (Local ID); 2021 (Archive number); 2021 (OAI)
Available from: 2021-09-03. Created: 2021-09-03. Last updated: 2021-12-23. Bibliographically approved
Olsson, T., Ericsson, M. & Wingkvist, A. (2021). Hard Cases in Source Code to Architecture Mapping using Naive Bayes. In: Robert Heinrich, Raffaela Mirandola, Danny Weyns (Ed.), Companion Proceedings of the 15th European Conference on Software Architecture: ECSA 2021 Companion Volume Virtual (originally: Växjö, Sweden), 13-17 September, 2021. Paper presented at 15th European Conference on Software Architecture, ECSA 2021 Companion Volume Virtual (originally: Växjö, Sweden), 13-17 September, 2021. CEUR-WS.org
2021 (English). In: Companion Proceedings of the 15th European Conference on Software Architecture: ECSA 2021 Companion Volume Virtual (originally: Växjö, Sweden), 13-17 September, 2021 / [ed] Robert Heinrich, Raffaela Mirandola, Danny Weyns, CEUR-WS.org, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

The automatic mapping of source code entities to architectural modules is a challenging problem that is necessary to solve if we want to increase the use of Static Architecture Conformance Checking in the industry. We apply the state-of-the-art automatic mapping technique to eight open-source systems and find that there are systematic problems in the automatically created mappings. All of these eight systems have small modules that are very hard to map correctly since only a few source code entities are mapped to these. All systems seem to use some naming strategy, mapping source code to modules; however, naming is often ambiguous. We also find differences in ground truth mappings performed by experts, which affect mappings based on these, and that architectural refactoring also affects the mapping performance. 

Place, publisher, year, edition, pages
CEUR-WS.org, 2021
Series
CEUR Workshop Proceedings, E-ISSN 1613-0073
Keywords
Incremental clustering, Orphan adoption, Naive Bayes, Software architecture, Machine learning
National Category
Software Engineering
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-110145 (URN); 2-s2.0-85117859228 (Scopus ID)
Conference
15th European Conference on Software Architecture, ECSA 2021 Companion Volume Virtual (originally: Växjö, Sweden), 13-17 September, 2021
Available from: 2022-02-04. Created: 2022-02-04. Last updated: 2023-04-12. Bibliographically approved
Olsson, T., Ericsson, M. & Wingkvist, A. (2021). Optimized Dependency Weights in Source Code Clustering. In: Biffl, S., Navarro, E., Lowe, W., Sirjani, M., Mirandola, R., Weyns, D. (Ed.), Software Architecture, ECSA 2021. Paper presented at 15th European Conference, ECSA 2021, Virtual Event, Sweden, September 13-17, 2021 (pp. 223-239). Springer, 12857
2021 (English). In: Software Architecture, ECSA 2021 / [ed] Biffl, S., Navarro, E., Lowe, W., Sirjani, M., Mirandola, R., Weyns, D., Springer, 2021, Vol. 12857, p. 223-239. Conference paper, Published paper (Refereed)
Abstract [en]

Some methods use the dependencies between source code entities to perform clustering to, e.g., automatically map to an intended modular architecture or reconstruct the implemented software architecture. However, there are many different ways that source code entities can depend on each other in an object-oriented system, and it is not likely that all dependencies are equally useful. We investigate how well an optimized set of weights for 14 different types of dependencies perform when automatically mapping source code to modules using an established mapping technique. The optimized weights were found using genetic optimization. We compare the F1 score of precision and recall to uniform weights and weights computed by module relation ratio in eight open-source systems to evaluate performance. Our experiments show that optimized weights significantly outperform the others, especially in systems that seem not to have been designed using the low coupling, high cohesion principle. We also find that dependencies based on method calls are not useful for automatic mapping in any of the eight systems.
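
The F1 score used for the comparison is the harmonic mean of precision and recall. A minimal sketch of computing it for a source-code-to-module mapping follows. How the paper counts unmapped entities is not stated in the abstract, so the convention here is an assumption: precision is taken over entities that received a mapping, and recall over all entities in the ground truth. The function name `mapping_f1` and the data are hypothetical.

```python
def mapping_f1(predicted, truth):
    """Return (precision, recall, f1) of a predicted entity-to-module mapping.

    Assumed convention: entities predicted as None count as unmapped; they
    lower recall but do not affect precision.
    """
    mapped = {e: m for e, m in predicted.items() if m is not None}
    correct = sum(1 for e, m in mapped.items() if truth.get(e) == m)
    precision = correct / len(mapped) if mapped else 0.0
    recall = correct / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


truth = {"A.java": "gui", "B.java": "gui", "C.java": "logic", "D.java": "model"}
predicted = {"A.java": "gui", "B.java": "logic", "C.java": "logic", "D.java": None}
precision, recall, f1 = mapping_f1(predicted, truth)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.5 0.571
```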

Place, publisher, year, edition, pages
Springer, 2021
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords
Orphan adoption, Software architecture, Incremental clustering, Corrective clustering, Source code dependencies
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-108067 (URN); 10.1007/978-3-030-86044-8_16 (DOI); 000696174400016 (); 2-s2.0-85115124435 (Scopus ID); 9783030860448 (ISBN); 9783030860431 (ISBN)
Conference
15th European Conference, ECSA 2021, Virtual Event, Sweden, September 13-17, 2021
Available from: 2021-11-16. Created: 2021-11-16. Last updated: 2023-04-12. Bibliographically approved