Automated subject classification of textual documents in the context of Web-based hierarchical browsing
University of Bath, UK. (Library and Information Science) ORCID iD: 0000-0003-4169-4777
2011 (English) In: Knowledge organization, ISSN 0943-7444, Vol. 38, no. 3, pp. 230-244. Article in journal (Refereed). Published
Abstract [en]

While automated methods for information organization have been around for several decades now, exponential growth of the World Wide Web has put them into the forefront of research in different communities, within which several approaches can be identified: 1) machine learning (algorithms that allow computers to improve their performance based on learning from pre-existing data); 2) document clustering (algorithms for unsupervised document organization and automated topic extraction); and 3) string matching (algorithms that match given strings within larger text). Here the aim was to automatically organize textual documents into hierarchical structures for subject browsing. The string-matching approach was tested using a controlled vocabulary (containing pre-selected and pre-defined authorized terms, each corresponding to only one concept). The results imply that an appropriate controlled vocabulary, with a sufficient number of entry terms designating classes, could in itself be a solution for automated classification. Then, if the same controlled vocabulary had an appropriate hierarchical structure, it would at the same time provide a good browsing structure for the collection of automatically classified documents.
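
As an illustration only, the string-matching idea described in the abstract can be sketched as follows. This is a minimal sketch, not the method used in the article: the vocabulary entries, the class labels, the `classify` function, and the scoring by raw match counts are all hypothetical assumptions made for the example. Class labels are written as slash-separated paths so that the same vocabulary also suggests a browsing hierarchy.

```python
import re
from collections import Counter

# Hypothetical toy vocabulary (an assumption for this sketch):
# each entry term designates exactly one class, written as a hierarchical path.
VOCABULARY = {
    "machine learning": "Computing/Artificial intelligence",
    "document clustering": "Computing/Information retrieval",
    "controlled vocabulary": "Library science/Knowledge organization",
    "subject browsing": "Library science/Knowledge organization",
}


def classify(text, vocabulary=VOCABULARY):
    """Rank classes by how often their entry terms occur in the text."""
    lowered = text.lower()
    scores = Counter()
    for term, label in vocabulary.items():
        # Whole-word, case-insensitive match of the entry term.
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            scores[label] += hits
    # Highest-scoring class first; empty list if no term matched.
    return scores.most_common()


result = classify("A controlled vocabulary supports subject browsing.")
```

In this sketch a document is assigned to whichever classes have matching entry terms, so coverage depends entirely on how many entry terms the vocabulary provides per class, which is the dependency the abstract points to.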

Place, publisher, year, edition, pages
Ergon-Verlag, 2011. Vol. 38, no. 3, pp. 230-244
National subject category
Library and Information Science
Research subject
Humanities, Library and Information Science
Identifiers
URN: urn:nbn:se:lnu:diva-37057
Scopus ID: 2-s2.0-79960942208
OAI: oai:DiVA.org:lnu-37057
DiVA, id: diva2:747709
Available from: 2014-09-17 Created: 2014-09-17 Last updated: 2019-08-29 Bibliographically approved

Open Access in DiVA

fulltext (345 kB), 218 downloads
File information
File name: FULLTEXT01.pdf File size: 345 kB Checksum: SHA-512
c03de82803cf4222ee324cfb615b37e1650adc1d867d00864794f852f7f375df395b196c4749ce450d3a79a9d145f106c5d928adb09590f2d87cdc5b4c4f2a63
Type: fulltext Mimetype: application/pdf

Person records (beta)

Golub, Koraljka
