Optimized Machine Learning Input for Evolutionary Source Code to Architecture Mapping
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0003-1154-5308
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0003-1173-5187
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0002-0835-823X
2023 (English). In: Software Architecture. ECSA 2022 Tracks and Workshops. ECSA 2022 / [ed] Batista, T., Bureš, T., Raibulet, C., Muccini, H., Springer, 2023, p. 421-435. Conference paper, Published paper (Refereed)
Abstract [en]

Automatically mapping source code to architectural modules is an interesting and difficult problem. Mapping can be considered a classification problem, and machine learning approaches have been used to automatically generate mappings. Feature engineering is an essential element of machine learning. We study which source code features are important for an algorithm to function effectively. Additionally, we examine stemming and data cleaning. We systematically evaluate various combinations of features on five datasets created from JabRef, TeamMates, ProM, and two Hadoop subsystems. The systems are open-source with well-established mappings. We find that no single set of features consistently provides the highest performance, and even the subsystems of Hadoop have varied optimal feature combinations. Stemming provided minimal benefit, and cleaning the data is not worth the effort, as it also provided minimal benefit.
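To illustrate the framing of mapping as a classification problem, here is a minimal sketch, not the authors' method. It assumes a multinomial naive Bayes classifier over identifier tokens; the abstract does not name the algorithm or the exact features, and the module names and code snippets below are hypothetical. Stemming and data cleaning are left out.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split source text into lowercase word tokens (camelCase aware)."""
    words = re.findall(r"[A-Za-z][a-z]+|[A-Z]+(?![a-z])", text)
    return [w.lower() for w in words]


def train(samples):
    """Count files and tokens per module from (module, source_text) pairs."""
    doc_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for module, text in samples:
        doc_counts[module] += 1
        for tok in tokenize(text):
            token_counts[module][tok] += 1
            vocab.add(tok)
    return doc_counts, token_counts, vocab


def predict(model, text):
    """Return the module with the highest naive Bayes log-score."""
    doc_counts, token_counts, vocab = model
    total_docs = sum(doc_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen tokens
    best, best_score = None, -math.inf
    for module in doc_counts:
        score = math.log(doc_counts[module] / total_docs)
        # Laplace smoothing: add one to every token count.
        denom = sum(token_counts[module].values()) + len(vocab)
        for tok in tokens:
            score += math.log((token_counts[module][tok] + 1) / denom)
        if score > best_score:
            best, best_score = module, score
    return best


# Hypothetical training data: files already mapped to architectural modules.
TRAINING = [
    ("gui", "class MainFrame extends JFrame { JButton saveButton; }"),
    ("gui", "class EntryEditorPanel { JLabel titleLabel; }"),
    ("model", "class BibEntry { String citeKey; List<Field> fields; }"),
    ("model", "class BibDatabase { List<BibEntry> entries; }"),
]

MODEL = train(TRAINING)
```

Calling `predict(MODEL, source_text)` on an unmapped file returns the name of the most likely module. In this framing, changing the feature set amounts to changing what `tokenize` extracts from each file.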

Place, publisher, year, edition, pages
Springer, 2023. p. 421-435
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 13928
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
URN: urn:nbn:se:lnu:diva-126305
DOI: 10.1007/978-3-031-36889-9_28
Scopus ID: 2-s2.0-85186769388
ISBN: 9783031368882 (print)
ISBN: 9783031368899 (electronic)
OAI: oai:DiVA.org:lnu-126305
DiVA, id: diva2:1825604
Conference
Software Architecture. ECSA 2022 Tracks and Workshops, Prague, Czech Republic, September 19–23, 2022
Available from: 2024-01-09 Created: 2024-01-09 Last updated: 2024-03-20. Bibliographically approved

Authority records

Olsson, Tobias; Ericsson, Morgan; Wingkvist, Anna