Mining Git Repositories: An introduction to repository mining
Linnaeus University, Faculty of Technology, Department of Computer Science.
2013 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

When performing an analysis of the evolution of software quality and software metrics, there is a need to get access to as many versions of the source code as possible. There is a lack of research on how data or source code can be extracted from the source control management system Git. This thesis explores different possibilities to resolve this problem.

Lately, there has been a boom in usage of the version control system Git. GitHub alone hosts about 6,100,000 projects. Some well-known projects and organizations that use Git are Linux, WordPress, and Facebook. Even with these figures and clients, there are very few tools able to perform data extraction from Git repositories. A pre-study showed that there is a lack of standardization on how to share mining results and the methods used to obtain them.

There are several tools available for older version control systems, such as Concurrent Versions System (CVS), but few for Git. The examined repository mining applications for Git are either poorly documented or built very specifically for the project for which they were designed.

This thesis compiles a list of general issues encountered when using repository mining as a tool for data gathering. A selection of existing repository mining tools was evaluated against a set of prerequisite criteria. The end result of this evaluation is the creation of a new repository mining tool called Doris. This tool also includes a small code metrics analysis library to show how it can be extended.

Place, publisher, year, edition, pages
2013. p. 28
Keyword [en]
repository mining, msr, git, quality analysis, version control system, vcs, source control management, scm, data mining, data extraction
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:lnu:diva-27742
OAI: oai:DiVA.org:lnu-27742
DiVA, id: diva2:638844
Subject / course
Computer Science
Presentation
2013-06-03, 18:35 (English)
Available from: 2013-08-12. Created: 2013-08-02. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

Mining Git Repositories (859 kB), 2098 downloads
File information
File name: FULLTEXT01.pdf
File size: 859 kB
Checksum (SHA-512): 9fbb34f0d764d7e85fad4df29455a9e1b92ff23b05a86653034d5f5931bf3ec1be8b62d44abaa5ca63bc4e57c3afd5720ab4b0c4c53c4b80065318844509a985
Type: fulltext
Mimetype: application/pdf

Author
Carlsson, Emil