Importance of HTML structural elements and metadata in automated subject classification
Lund University. ORCID iD: 0000-0003-4169-4777
Lund University.
2005 (English). In: Research and advanced technology for digital libraries / [ed] Andreas Rauber, Stavros Christodoulakis, A Min Tjoa, Springer, 2005, pp. 368-378. Conference paper, published paper (refereed)
Abstract [en]

The aim of the study was to determine how significance indicators assigned to different Web page elements (internal metadata, title, headings, and main text) influence automated classification. The data collection that was used comprised 1000 Web pages in engineering, to which Engineering Information classes had been manually assigned. The significance indicators were derived using several different methods: (total and partial) precision and recall, semantic distance and multiple regression. It was shown that for best results all the elements have to be included in the classification process. The exact way of combining the significance indicators turned out not to be overly important: using the F1 measure, the best combination of significance indicators yielded no more than 3% higher performance results than the baseline.
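
The idea of weighting page elements and scoring the result with F1 can be illustrated with a minimal sketch. The weights, element names, and function names below are hypothetical and chosen only for illustration; they are not the significance indicators or the implementation reported in the paper.

```python
from typing import Dict, Set

# Hypothetical significance indicators (weights) per Web page element.
# Illustrative values only; not taken from the paper.
WEIGHTS: Dict[str, float] = {
    "metadata": 2.0,
    "title": 3.0,
    "headings": 2.0,
    "text": 1.0,
}


def class_score(matches: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of term matches for one class across page elements."""
    return sum(weights.get(element, 0.0) * count
               for element, count in matches.items())


def f1(assigned: Set[str], correct: Set[str]) -> float:
    """F1 measure: harmonic mean of precision and recall."""
    hits = len(assigned & correct)
    if hits == 0:
        return 0.0
    precision = hits / len(assigned)
    recall = hits / len(correct)
    return 2 * precision * recall / (precision + recall)


# One class term found once in the title and four times in the main text.
score = class_score({"title": 1, "text": 4}, WEIGHTS)

# Automatically assigned classes compared with manually assigned ones.
quality = f1({"A", "B"}, {"A", "C"})
```

Under these assumed weights, a single title match counts three times as much as a single match in the main text, and changing the weights only changes the outcome when it reorders the candidate classes.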

Place, publisher, year, edition, pages
Springer, 2005. pp. 368-378
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 3652
National subject category
Library and Information Science
Research subject
Humanities, Library and Information Science
Identifiers
URN: urn:nbn:se:lnu:diva-37071
DOI: 10.1007/11551362_33
ISBN: 978-3-540-28767-4 (print)
ISBN: 978-3-540-31931-3 (print)
OAI: oai:DiVA.org:lnu-37071
DiVA, id: diva2:747760
Conference
Research and Advanced Technology for Digital Libraries, Proceedings of ECDL 2005 – the 9th European Conference on Research and Advanced Technology for Digital Libraries, Vienna, Austria, September 18-23, 2005
Available from: 2014-09-17. Created: 2014-09-17. Last updated: 2015-09-30. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Person

Golub, Koraljka
