Visualization of Text Duplicates in Documents
Växjö University, Faculty of Mathematics/Science/Technology, School of Mathematics and Systems Engineering.
2009 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 15 credits / 22.5 HE credits. Student thesis
Abstract [en]

In this thesis, a tool to visualize duplicate parts in a series of given documents is developed.

Text duplicates are very common nowadays in all fields. Copying severely harms the rights of the original authors, though it facilitates the work of those who copy from them. Effective legal measures have been taken on copyright issues, and an increasingly large number of people pay serious attention to what they write when they refer to other people's work. Although references are properly made by many who admire and respect others' achievements, plagiarism takes place all the time. Therefore, an intuitive way of visualizing duplicate parts is needed so that people can easily grasp the purpose of those duplicates and decide on their legality. In computer science, software cloning is a very typical phenomenon among different development groups or even within one group. Since a piece of software usually has a hierarchy, that hierarchy is also of interest to group members when they run clone detection on their own or other software. For example, if a good overview of the hierarchies is provided in a tree representation, one can easily locate the clones of a particular node in other trees. Further interaction techniques can allow access to the concrete code by double-clicking on a highlighted node.

To visualize duplicate parts in a clear and intuitive way, a visualization tool is developed for this thesis project. When it is complete, the tool should provide the following features. First, it can visualize similar or identical parts in a given data set. Second, the hierarchies of the files can be shown with a proper layout. Third, the user can manipulate the data items on the screen in order to gain better insight into the data set and to support analysis tasks. Fourth, different levels of abstraction are provided so that the user can either get an overview of all the files or specifically check the duplicate parts in the documents of interest.

Place, publisher, year, edition, pages
2009. 77 pages
Series
Rapporter från MSI, ISSN 1650-2647
Keywords [en]
Duplicates, PREFUSE, Visualization, Treemap, Similarity, Interaction
Identifiers
URN: urn:nbn:se:vxu:diva-5408
ISRN: VXU/MSI/DA/E/--09029/--SE
OAI: oai:DiVA.org:vxu-5408
DiVA, id: diva2:224192
Presentation
D1169, School of Mathematics and Systems Engineering, D building, Växjö University (English)
Projects
Visualization of Text Duplicates in Documents
Available from: 2009-06-17. Created: 2009-06-17. Last updated: 2010-03-10. Bibliographically approved.

Open Access in DiVA

Full text (3277 kB), 359 downloads
File information
File: FULLTEXT01.pdf
File size: 3277 kB
Checksum (SHA-512): 0346157fbf818e3e9e6e5932d36fb0b8b7b0d778dd2feec0e4d53ac6c9f6b861d541155884b5da91908529d6147a669b2edd358cbaa555697adac8a8a6f31575
Type: fulltext
Mimetype: application/pdf

Author/editor
Wang, Chao; Pan, Han