Mapping Source Code to Software Architecture by Leveraging Large Language Models
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). Volvo Group, Sweden. ORCID iD: 0000-0002-3047-4296
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0001-6981-0966
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0003-1154-5308
2024 (English). In: Software Architecture: ECSA 2024 Tracks and Workshops, Springer Nature, 2024, Vol. 14937, p. 133-149. Conference paper, Published paper (Refereed)
Abstract [en]

Architecture refactoring is challenging: it requires thorough analysis and labor-intensive, error-prone activities to restructure functionality from a legacy architecture into a new intended one, and the source code must be adapted to match the new structure. In this context, automatically mapping source code to the intended architecture would significantly reduce manual work and help prevent technical debt. To this end, we aim to map methods to architectural modules that are defined solely by textual descriptions, formulating the task as a machine learning text classification problem. Methods are mapped to modules using several approaches. We apply the proposed approach to an open-source software system; the results show that vectorizing text and code using large language models outperforms other modern methods. The machine learning classifiers applied perform comparably, with the best attaining an accuracy of around 40% and an F1-score of around 30%.
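
The general idea of mapping a method to a module that is defined only by a textual description can be illustrated with a minimal sketch. This is not the paper's implementation: the record does not give its vectorizers or classifiers, so the sketch swaps in a bag-of-words count vector as a stand-in for a large language model embedding and uses nearest-description cosine similarity instead of a trained classifier. The module names, descriptions, and the sample method are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Split camelCase identifiers into words, then keep lowercase alphabetic tokens.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def vectorize(text):
    # ASSUMPTION: a sparse bag-of-words vector stands in for an LLM embedding.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_method_to_module(method_source, module_descriptions):
    # Assign the method to the module whose description is most similar.
    method_vec = vectorize(method_source)
    scores = {
        name: cosine(method_vec, vectorize(desc))
        for name, desc in module_descriptions.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical modules, each defined solely by a textual description.
MODULES = {
    "persistence": "Stores and loads user records from the database",
    "rendering": "Draws shapes and text on the screen canvas",
}

# Hypothetical method source to be mapped.
METHOD = "void saveUserRecord(User user) { database.store(user); }"

best, scores = map_method_to_module(METHOD, MODULES)
print(best)
```

In a setup closer to the one the abstract describes, `vectorize` would presumably be replaced by a language model embedding of the method and description text, and the similarity lookup by a trained text classifier.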

Place, publisher, year, edition, pages
Springer Nature, 2024. Vol. 14937, p. 133-149
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords [en]
large language models, machine learning, software architecture, software refactoring, source code mapping to architecture
National Category
Software Engineering
Research subject
Computer Science, Software Technology
Identifiers
URN: urn:nbn:se:lnu:diva-138378
DOI: 10.1007/978-3-031-71246-3_13
Scopus ID: 2-s2.0-85204359733
OAI: oai:DiVA.org:lnu-138378
DiVA, id: diva2:1956836
Conference
18th European Conference on Software Architecture, Luxembourg City, Luxembourg, 3 – 6 September, 2024
Available from: 2025-05-07 Created: 2025-05-07 Last updated: 2025-05-19 Bibliographically approved

Authority records

Johansson, Nils; Caporuscio, Mauro; Olsson, Tobias
