Hönel, Sebastian
Publications (4 of 4)
Hönel, S., Ericsson, M., Löwe, W. & Wingkvist, A. (2019). Bayesian Regression on segmented data using Kernel Density Estimation. In: 5th annual Big Data Conference: Linnaeus University, Växjö, Sweden, 5-6 December 2019. Paper presented at 5th annual Big Data Conference, Linnaeus University, Växjö, Sweden, 5-6 December 2019. Zenodo
2019 (English). In: 5th annual Big Data Conference: Linnaeus University, Växjö, Sweden, 5-6 December 2019, Zenodo, 2019. Conference paper, Poster (with or without abstract) (Other academic)
Abstract [en]

The challenge of having to deal with dependent variables in classification and regression using techniques based on Bayes' theorem is often avoided by assuming a strong independence between them, hence such techniques are said to be naive. While analytical solutions supporting classification on arbitrary amounts of discrete and continuous random variables exist, practical solutions are scarce. We are evaluating a few Bayesian models empirically and consider their computational complexity. To overcome the often assumed independence, those models attempt to resolve the dependencies using empirical joint conditional probabilities and joint conditional probability densities. These are obtained by posterior probabilities of the dependent variable after segmenting the dataset for each random variable's value. We demonstrate the advantages of these models, such as their nature being deterministic (no randomization or weights required), that no training is required, that each random variable may have any kind of probability distribution, how robustness is upheld without having to impute missing data, and that online learning is effortlessly possible. We compare such Bayesian models against well-established classifiers and regression models, using some well-known datasets. We conclude that our evaluated models can outperform other models in certain settings, using classification. The regression models deliver respectable performance, without leading the field.
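The segmentation idea described in the abstract can be illustrated for the purely discrete case. The following is a minimal sketch, not the authors' implementation: the function name `segmented_posterior`, the toy dataset, and the rule of skipping rows that lack a feature are all assumptions made for illustration. Continuous variables, which the abstract handles with probability densities, are not covered here.

```python
from collections import Counter

def segmented_posterior(rows, target, evidence):
    """Empirical P(target | evidence), obtained by segmenting the rows
    once per observed variable value (hypothetical helper)."""
    segment = rows
    for feature, value in evidence.items():
        # Keep only rows that agree with the observed value.
        # Rows missing the feature simply drop out; nothing is imputed.
        segment = [r for r in segment if r.get(feature) == value]
    counts = Counter(r[target] for r in segment)
    total = sum(counts.values())
    if total == 0:
        return {}  # no matching rows, so no empirical estimate
    return {label: n / total for label, n in counts.items()}

# Toy data, invented for this example only.
rows = [
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": True, "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": True, "play": "no"},
    {"outlook": "sunny", "windy": False, "play": "yes"},
]

posterior = segmented_posterior(rows, "play", {"outlook": "sunny", "windy": False})
print(posterior)
```

Because the estimate is a plain count over the matching segment, the result is deterministic, and new rows can be appended at any time without a training step, which matches the properties the abstract lists.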

Place, publisher, year, edition, pages
Zenodo, 2019
Keywords
Bayes Theorem, Classification, Regression
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-90518 (URN); 10.5281/zenodo.3571980 (DOI)
Conference
5th annual Big Data Conference, Linnaeus University, Växjö, Sweden, 5-6 December 2019
Available from: 2019-12-12. Created: 2019-12-12. Last updated: 2019-12-19. Bibliographically approved.
Hönel, S., Ericsson, M., Löwe, W. & Wingkvist, A. (2019). Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities. In: Dr. David Shepherd (Ed.), 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). Paper presented at The 19th IEEE International Conference on Software Quality, Reliability, and Security, July 22-26, 2019, Sofia, Bulgaria (pp. 109-120). IEEE
2019 (English). In: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS) / [ed] Dr. David Shepherd, IEEE, 2019, p. 109-120. Conference paper, Published paper (Refereed)
Abstract [en]

Commit classification, the automatic classification of the purpose of changes to software, can support the understanding and quality improvement of software and its development process. We introduce code density of a commit, a measure of the net size of a commit, as a novel feature and study how well it is suited to determine the purpose of a change. We also compare the accuracy of code-density-based classifications with existing size-based classifications. By applying standard classification models, we demonstrate the significance of code density for the accuracy of commit classification. We achieve up to 89% accuracy and a Kappa of 0.82 for the cross-project commit classification where the model is trained on one project and applied to other projects. Such highly accurate classification of the purpose of software changes helps to improve the confidence in software (process) quality analyses exploiting this classification information.
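The two figures quoted in the abstract, accuracy and Kappa, can both be derived from paired actual and predicted labels. The sketch below assumes that Kappa refers to Cohen's kappa, which the abstract does not state explicitly; the label values and the helper name `accuracy_and_kappa` are hypothetical.

```python
from collections import Counter

def accuracy_and_kappa(actual, predicted):
    """Return (accuracy, Cohen's kappa) for two equally long label sequences."""
    n = len(actual)
    # Observed agreement: fraction of items where both labels match.
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1.0:
        return observed, 1.0  # only one label in use; kappa is degenerate
    return observed, (observed - expected) / (1.0 - expected)

# Invented labels standing in for three commit categories.
actual = ["a", "a", "c", "c", "p", "p"]
predicted = ["a", "a", "c", "p", "p", "p"]
acc, kappa = accuracy_and_kappa(actual, predicted)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than raw accuracy whenever the class distribution makes some agreement likely without any skill.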

Place, publisher, year, edition, pages
IEEE, 2019
Keywords
Software Quality, Commit Classification, Source Code Density, Maintenance Activities
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-85473 (URN); 10.1109/QRS.2019.00027 (DOI); 9781728139272 (ISBN); 9781728139289 (ISBN)
Conference
The 19th IEEE International Conference on Software Quality, Reliability, and Security, July 22-26, 2019, Sofia, Bulgaria
Available from: 2019-06-17. Created: 2019-06-17. Last updated: 2020-01-29. Bibliographically approved.
Hönel, S., Ericsson, M., Löwe, W. & Wingkvist, A. (2018). A changeset-based approach to assess source code density and developer efficacy. In: ICSE '18 Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings. Paper presented at 40th ACM/IEEE International Conference on Software Engineering (ICSE), May 27-June 3, 2018, Gothenburg, Sweden (pp. 220-221). IEEE
2018 (English). In: ICSE '18 Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, IEEE, 2018, p. 220-221. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

The productivity of a (team of) developer(s) can be expressed as a ratio between effort and delivered functionality. Several different estimation models have been proposed. These are based on statistical analysis of real development projects; their accuracy depends on the number and the precision of data points. We propose a data-driven method to automate the generation of precise data points. Functionality is proportional to the code size and Lines of Code (LoC) is a fundamental metric of code size. However, code size and LoC are not well defined as they could include or exclude lines that do not affect the delivered functionality. We present a new approach to measure the density of code in software repositories. We demonstrate how the accuracy of development time spent in relation to delivered code can be improved when basing it on net- instead of gross-size measurements. We validated our tool by studying ca. 1,650 open-source software projects.
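The distinction between gross and net size can be made concrete with a small example. This is only a sketch under stated assumptions: it treats density as the ratio of net to gross lines and counts blank lines and single-line comments as the lines that do not affect functionality. The abstract does not give the exact definition used by the authors' tool, and the name `code_density` and the `comment_prefix` parameter are hypothetical.

```python
def code_density(lines, comment_prefix="#"):
    """Ratio of net lines (neither blank nor comment-only) to gross lines.

    Assumed definition for illustration; not taken from the paper.
    """
    gross = len(lines)
    net = sum(
        1
        for line in lines
        if line.strip() and not line.strip().startswith(comment_prefix)
    )
    return net / gross if gross else 0.0

# Invented changeset content: five gross lines, two of which carry code.
changed_lines = [
    "# helper",
    "",
    "def f(x):",
    "    return x + 1",
    "",
]
density = code_density(changed_lines)
print(density)
```

A changeset consisting mostly of whitespace or comments yields a density near zero, so relating development time to net size avoids crediting such lines as delivered functionality.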

Place, publisher, year, edition, pages
IEEE, 2018
Series
Proceedings of the IEEE-ACM International Conference on Software Engineering Companion, ISSN 2574-1926
Keywords
Software Repositories, Clone Detection, Source code density, Effort estimation
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-79016 (URN); 10.1145/3183440.3195105 (DOI); 000450109000080 (); 2-s2.0-85049691648 (Scopus ID); 978-1-4503-5663-3 (ISBN)
Conference
40th ACM/IEEE International Conference on Software Engineering (ICSE), May 27-June 3, 2018, Gothenburg, Sweden
Available from: 2018-12-06. Created: 2018-12-06. Last updated: 2019-08-29. Bibliographically approved.
Ulan, M., Hönel, S., Martins, R. M., Ericsson, M., Löwe, W., Wingkvist, A. & Kerren, A. (2018). Quality Models Inside Out: Interactive Visualization of Software Metrics by Means of Joint Probabilities. In: J. Ángel Velázquez Iturbide, Jaime Urquiza Fuentes, Andreas Kerren, and Mircea F. Lungu (Ed.), Proceedings of the 2018 Sixth IEEE Working Conference on Software Visualization, (VISSOFT), Madrid, Spain, 2018. Paper presented at IEEE Working Conference on Software Visualization (VISSOFT), Madrid, Spain, 24-25 September, 2018 (pp. 65-75). IEEE
2018 (English). In: Proceedings of the 2018 Sixth IEEE Working Conference on Software Visualization, (VISSOFT), Madrid, Spain, 2018 / [ed] J. Ángel Velázquez Iturbide, Jaime Urquiza Fuentes, Andreas Kerren, and Mircea F. Lungu, IEEE, 2018, p. 65-75. Conference paper, Published paper (Refereed)
Abstract [en]

Assessing software quality, in general, is hard; each metric has a different interpretation, scale, range of values, or measurement method. Combining these metrics automatically is especially difficult, because they measure different aspects of software quality, and creating a single global final quality score limits the evaluation of the specific quality aspects and trade-offs that exist when looking at different metrics. We present a way to visualize multiple aspects of software quality. In general, software quality can be decomposed hierarchically into characteristics, which can be assessed by various direct and indirect metrics. These characteristics are then combined and aggregated to assess the quality of the software system as a whole. We introduce an approach for quality assessment based on joint distributions of metrics values. Visualizations of these distributions allow users to explore and compare the quality metrics of software systems and their artifacts, and to detect patterns, correlations, and anomalies. Furthermore, it is possible to identify common properties and flaws, as our visualization approach provides rich interactions for visual queries to the quality models’ multivariate data. We evaluate our approach in two use cases based on: 30 real-world technical documentation projects with 20,000 XML documents, and an open source project written in Java with 1000 classes. Our results show that the proposed approach allows an analyst to detect possible causes of bad or good quality.

Place, publisher, year, edition, pages
IEEE, 2018
Keywords
hierarchical data exploration, multivariate data visualization, joint probabilities, t-SNE, data abstraction
National Category
Human Computer Interaction; Software Engineering
Research subject
Computer Science, Information and software visualization; Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-78093 (URN); 10.1109/VISSOFT.2018.00015 (DOI); 2-s2.0-85058463111 (Scopus ID); 978-1-5386-8292-0 (ISBN); 978-1-5386-8293-7 (ISBN)
Conference
IEEE Working Conference on Software Visualization (VISSOFT), Madrid, Spain, 24-25 September, 2018
Projects
Software technology for self-adaptive systems
Funder
Knowledge Foundation, 20150088
Available from: 2018-10-01. Created: 2018-10-01. Last updated: 2019-08-29. Bibliographically approved.