Supervised Ontology-Based Document Classification Model
Norwegian University of Science and Technology, Norway. ORCID iD: 0000-0002-0199-2377
Norwegian University of Science and Technology, Norway.
2017 (English) In: Proceedings of the International Conference on Compute and Data Analysis, ICCDA'17, ACM Publications, 2017, p. 245-251. Conference paper, Published paper (Refereed)
Abstract [en]

Ontology-based document classification relies on background knowledge provided by ontologies to represent documents. This background knowledge is embedded in a document using the exact matching technique, which maps a term to a concept by searching only for concept labels that occur explicitly in the document. Searching only for the presence of concept labels limits the ability to capture and exploit the full conceptualization involved in user information and content meanings. To address this limitation, we propose a new ontology-based document classification model. The proposed model uses background knowledge derived from ontologies for document representation. It associates a document with a set of concepts not only by exact matching but also by identifying and extracting new terms that can be semantically related to the concepts of the ontology. Additionally, the proposed model employs a new concept weighting technique that computes the weight of a concept from its relevance and its importance. We conducted several experiments using a real ontology and a dataset to test the proposed model. The experiments, run with three classification algorithms using the baseline ontology, the concept vector space model improved with the new concept weighting technique, and the enriched ontology, show that the proposed model considerably improves classification performance.
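
The exact matching baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not cover the proposed term enrichment or concept weighting, whose details the abstract does not give. The toy ontology, its labels, and the function name `exact_match_concepts` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy ontology: concept identifier -> concept labels.
# These concepts and labels are illustrative assumptions, not from the paper.
ONTOLOGY_LABELS = {
    "Grant": ["grant", "funding"],
    "Project": ["project"],
    "ResearchProgramme": ["research programme"],
}


def exact_match_concepts(text, ontology_labels):
    """Count, per concept, how often any of its labels occurs verbatim in text."""
    # Lowercase and strip punctuation so that labels match on whole words.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    counts = Counter()
    for concept, labels in ontology_labels.items():
        for label in labels:
            pattern = r"\b" + re.escape(label.lower()) + r"\b"
            counts[concept] += len(re.findall(pattern, normalized))
    # Keep only the concepts whose labels actually occur in the document.
    return {concept: n for concept, n in counts.items() if n > 0}


doc = "The project received a grant; further funding supports the research programme."
concept_vector = exact_match_concepts(doc, ONTOLOGY_LABELS)
print(concept_vector)
```

A document that mentions only "financing" would map to no concept in this sketch, even though the term is semantically close to the hypothetical Grant concept. That is the kind of gap the abstract attributes to exact matching.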

Place, publisher, year, edition, pages
ACM Publications, 2017. p. 245-251
Keywords [en]
Document classification, First Sense Heuristic, Maximizing Semantic Similarity, Ontology, SEMCON, iCVS
National category
Computer Sciences
Research subject
Computer and Information Sciences, Informatics
Identifiers
URN: urn:nbn:se:lnu:diva-88767
DOI: 10.1145/3093241.3107883
ISBN: 978-1-4503-5241-3 (print)
OAI: oai:DiVA.org:lnu-88767
DiVA, id: diva2:1346434
Conference
International Conference on Compute and Data Analysis - ICCDA, May 19 - 23, 2017, Lakeland, FL, USA
Available from: 2019-08-27 Created: 2019-08-27 Last updated: 2019-09-06 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text: http://doi.acm.org/10.1145/3093241.3107883

Person

Kastrati, Zenun
