Metrics As Scores: A Tool- and Analysis Suite and Interactive Application for Exploring Context-Dependent Distributions
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (DISA; DSIQ; DISTA). ORCID iD: 0000-0001-7937-1645
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (DISA; DSIQ; DISTA). ORCID iD: 0000-0003-1173-5187
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (DISA; DSIQ; DISTA). ORCID iD: 0000-0002-7565-3714
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (DISA; DSIQ; DISTA). ORCID iD: 0000-0002-0835-823X
2023 (English). In: Journal of Open Source Software, E-ISSN 2475-9066, Vol. 8, no 88, article id 4913. Article in journal (Refereed). Published.
Abstract [en]

Metrics As Scores can be thought of as an interactive, multiple analysis of variance (abbr. "ANOVA," Chambers et al., 2017). An ANOVA might be used to estimate the goodness-of-fit of a statistical model. Beyond ANOVA, which analyzes the differences among hypothesized group means for a single quantity (feature), Metrics As Scores seeks to answer the question of whether a sample of a certain feature is more or less common across groups. This approach to data visualization and exploration has been used previously (e.g., Jiang et al., 2022). Beyond this, Metrics As Scores can determine what might constitute a good/bad, acceptable/alarming, or common/extreme value, and how distant a sample is from that value, for each group. This distance is expressed as a percentile (on a standardized scale of [0, 1]), which we call a score. Considering all available features among the existing groups furthermore allows the user to assess how different the groups are from each other, or whether they are indistinguishable from one another. The name Metrics As Scores derives from its initial application: examining differences in software metrics across application domains (Hönel et al., 2022). A software metric is an aggregation of one or more raw features according to some well-defined standard, method, or calculation. In software processes, such aggregations are often counts of events or of certain properties (Florac & Carleton, 1999). However, without the aggregation performed by a quality model, raw data (samples) and software metrics are rarely of great value to analysts and decision-makers, because quality models are conceived to establish a connection between software metrics and certain quality goals (Kaner & Bond, 2004). It is, therefore, difficult to answer the question "is my metric value good?".
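To make the notion of a score concrete, the following is a minimal, self-contained sketch of the underlying idea in plain Python. It is not the actual Metrics As Scores API; the function names and the two sample groups are illustrative. Given a group-specific sample and an ideal value, an observation's distance from the ideal is mapped to a percentile on [0, 1], where a higher score means the observation is closer to the ideal than the group's typical values.

```python
# Illustrative sketch of the score transform; NOT the Metrics As Scores
# API. Function names and data are made up for demonstration.

def ecdf(sample):
    """Return the empirical CDF of a 1-D sample as a function."""
    s = sorted(sample)
    n = len(s)
    return lambda x: sum(1 for v in s if v <= x) / n

def score(sample, ideal, value):
    """Map `value` to [0, 1]: 1.0 = no group value is closer to the
    ideal; 0.0 = every group value is at least as close."""
    distances = [abs(v - ideal) for v in sample]
    cdf = ecdf(distances)
    return 1.0 - cdf(abs(value - ideal))

# The same value scores differently depending on the group (context):
group_a = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # the ideal 3 is typical here
group_b = [10, 11, 12, 12, 13, 14]      # the whole group is far from 3
print(score(group_a, ideal=3, value=5))  # low: 5 is extreme within group A
print(score(group_b, ideal=3, value=5))  # high: 5 is closer than all of B
```

Because both scores live on the same [0, 1] scale, they are directly comparable across groups, which is exactly what the raw feature values do not offer.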
With Metrics As Scores we present an approach that, given an ideal value and a sample of sufficiently many relevant values, can transform any observation into a score. While previous work attempted to derive such ideal values for software metrics from, e.g., experience or surveys (Benlarbi et al., 2000), benchmarks (Alves et al., 2010), or by setting practical values (Grady, 1992), with Metrics As Scores we suggest additionally deriving ideal values in non-parametric, statistical ways. To do so, data first needs to be captured in a relevant context (group): a feature value might be good in one context, while it is less so in another. We therefore suggest generalizing and contextualizing the approach taken by Ulan et al. (2021), in which a score is defined to always have a range of [0, 1] and linear behavior. This means that scores can be compared and that a fixed increment in any score is equally valuable across scores; neither holds for raw features. Metrics As Scores consists of a tool- and analysis suite and an interactive application that allows researchers to explore and understand differences in scores across groups. The operationalization of features as scores consists of gathering values that are context-specific (group-typical), determining an ideal value non-parametrically or by user preference, and then transforming the observed values into distances. Metrics As Scores enables this procedure by unifying the way of obtaining probability densities/masses and conducting appropriate statistical tests. More than 120 different parametric distributions (approx. 20 of which are discrete) are fitted through a common interface. These distributions are part of the scipy package for the Python programming language, of which Metrics As Scores makes extensive use (Virtanen et al., 2020). While fitting continuous distributions is straightforward using maximum likelihood estimation, many discrete distributions have integral parameters.
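The common fitting interface can be sketched directly with scipy: every continuous distribution in scipy.stats exposes the same fit()/cdf() methods, so candidates can be fitted and ranked uniformly. The three-distribution subset below is purely illustrative (the tool fits 120+ candidates and runs proper goodness-of-fit tests, not just the KS statistic used here).

```python
# Sketch: fit several scipy.stats continuous distributions to one sample
# through their shared fit()/cdf() interface, then rank them by a simple
# Kolmogorov-Smirnov statistic. Small illustrative subset only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=50.0, scale=3.0, size=500)  # synthetic feature

candidates = [stats.norm, stats.gamma, stats.lognorm]
ks_by_name = {}
for dist in candidates:
    params = dist.fit(sample)  # maximum likelihood estimation
    ks_by_name[dist.name] = stats.kstest(sample, dist.cdf,
                                         args=params).statistic

best = min(ks_by_name, key=ks_by_name.get)
print(best, round(ks_by_name[best], 4))
```

This uniformity is what makes it feasible to treat a large catalog of distributions interchangeably behind one interface.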
For these, Metrics As Scores solves a mixed-variable global optimization problem using a genetic algorithm in pymoo (Blank & Deb, 2020). In addition, empirical distributions (continuous and discrete) and smooth approximate kernel density estimates are available. Applicable statistical tests for assessing the goodness-of-fit are performed automatically; their results are used to select the best-fitting random variable in the interactive web application. As an application written in Python, Metrics As Scores is made available as a package that is installable from the Python Package Index (PyPI): pip install metrics-as-scores. As such, the application can be used in a stand-alone manner and does not require additional packages, such as a web server or third-party libraries.
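For intuition on why integral parameters need special treatment, consider fitting a binomial distribution: the likelihood cannot be differentiated with respect to the integer parameter n, so plain MLE does not apply. The sketch below simply brute-forces n over a small integer range, using the closed-form MLE for p given each candidate n. This is an illustrative simplification, not the tool's method: Metrics As Scores instead formulates this as a mixed-variable global optimization problem solved by a genetic algorithm in pymoo.

```python
# Sketch: fitting a discrete distribution with an integral parameter.
# We brute-force binom's integer n and use p = mean/n (the MLE for p
# given n), maximizing the log-likelihood. Metrics As Scores solves the
# general case as a mixed-variable optimization problem with pymoo.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.binomial(n=20, p=0.3, size=300)  # synthetic discrete feature

best = (None, None, -np.inf)  # (n, p, log-likelihood)
for n in range(int(data.max()), int(data.max()) + 40):
    p = data.mean() / n                      # closed-form MLE for p | n
    ll = stats.binom.logpmf(data, n, p).sum()
    if ll > best[2]:
        best = (n, p, ll)

n_hat, p_hat, _ = best
print(n_hat, round(p_hat, 3))
```

An exhaustive scan works for one integer parameter over a small range; with several mixed parameters the search space grows combinatorially, which is why a genetic algorithm is the more practical choice.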

Place, publisher, year, edition, pages
Open Journals, 2023. Vol. 8, no 88, article id 4913
Keywords [en]
Metrics, Visualization, Conditional Distributions
National Category
Probability Theory and Statistics; Software Engineering
Research subject
Statistics/Econometrics; Computer Science, Information and software visualization
Identifiers
URN: urn:nbn:se:lnu:diva-124881; DOI: 10.21105/joss.04913; OAI: oai:DiVA.org:lnu-124881; DiVA id: diva2:1799945
Available from: 2023-09-25. Created: 2023-09-25. Last updated: 2024-05-06. Bibliographically approved.
In thesis
1. Quantifying Process Quality: The Role of Effective Organizational Learning in Software Evolution
2023 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Real-world software applications must constantly evolve to remain relevant. This evolution occurs when developing new applications or adapting existing ones to meet new requirements, make corrections, or incorporate future functionality. Traditional methods of software quality control involve software quality models and continuous code inspection tools; these measures focus on directly assessing the quality of the software. However, there is a strong correlation and causation between the quality of the development process and the resulting software product. Therefore, improving the development process indirectly improves the software product, too. To achieve this, effective learning from past processes is necessary, often conducted through post-mortem organizational learning. While qualitative evaluation of large artifacts is common, the smaller quantitative changes captured by application lifecycle management are often overlooked. In addition to software metrics, these smaller changes can reveal complex phenomena related to project culture and management. Leveraging these changes can help detect and address such complex issues.

Software evolution was previously measured by the size of changes, but the lack of consensus on a reliable and versatile quantification method prevents its use as a dependable metric. Different size classifications fail to reliably describe the nature of evolution. While application lifecycle management data is rich, it remains uncertain which artifacts can model detrimental managerial practices. Approaches such as simulation modeling, discrete-event simulation, or Bayesian networks have only a limited ability to exploit continuous-time process models of such phenomena. Even worse, the accessibility of, and mechanistic insight into, such gray- or black-box models is typically very low. To address these challenges, we suggest leveraging objectively captured digital artifacts from application lifecycle management, combined with qualitative analysis, for efficient organizational learning. A new language-independent metric is proposed to robustly capture the size of changes, significantly improving the accuracy of determining the nature of a change. The classified changes are then used to explore, visualize, and suggest maintenance activities, enabling solid prediction of the presence and severity of malpractices, even with limited data. Finally, parts of the automatic quantitative analysis are made accessible, potentially replacing expert-based qualitative analysis in part.

Place, publisher, year, edition, pages
Växjö: Linnaeus University Press, 2023
Series
Linnaeus University Dissertations ; 504
Keywords
Software Size, Software Metrics, Commit Classification, Maintenance Activities, Software Quality, Process Quality, Project Management, Organizational Learning, Machine Learning, Visualization, Optimization
National Category
Computer and Information Sciences; Software Engineering; Mathematical Analysis; Probability Theory and Statistics
Research subject
Computer Science, Software Technology; Computer Science, Information and software visualization; Computer and Information Sciences, Computer Science; Statistics/Econometrics
Identifiers
URN: urn:nbn:se:lnu:diva-124916; DOI: 10.15626/LUD.504.2023; ISBN: 9789180820738, 9789180820745
Public defence
2023-09-29, House D, D1136A, 351 95 Växjö, Växjö, 13:00 (English)
Available from: 2023-09-28. Created: 2023-09-27. Last updated: 2024-05-06. Bibliographically approved.

Open Access in DiVA

fulltext (367 kB), 318 downloads
File information
File name: FULLTEXT02.pdf. File size: 367 kB. Checksum: SHA-512
61c61798733741808a2b0053fdda37e96c38531f4aa749df0caedc3c29c88350cc988a01268e0190819b1cfa840a45bd5b5a4fb0442d4ced984bfb9413103a52
Type: fulltext. Mimetype: application/pdf

Authority records

Hönel, Sebastian; Ericsson, Morgan; Löwe, Welf; Wingkvist, Anna
