Supervised Ontology-Based Document Classification Model
Norwegian University of Science and Technology, Norway. ORCID iD: 0000-0002-0199-2377
Norwegian University of Science and Technology, Norway.
2017 (English). In: Proceedings of the International Conference on Compute and Data Analysis, ICCDA'17, ACM Publications, 2017, p. 245-251. Conference paper, Published paper (Refereed)
Abstract [en]

Ontology-based document classification relies on background knowledge from ontologies to represent documents. Background knowledge is embedded in a document using the exact matching technique, which maps a term to a concept by searching only for concept labels that explicitly occur in the document. Searching only for the presence of concept labels limits the ability to capture and exploit the whole conceptualization involved in user information and content meanings. To address this limitation, we propose a new document classification model based on ontologies. The proposed model uses background knowledge derived from ontologies for document representation. It associates a document with a set of concepts not only by exact matching but also by identifying and extracting new terms that can be semantically related to the concepts of the ontology. Additionally, the proposed model employs a new concept weighting technique that computes the weight of a concept from the relevance and the importance of the concept. We conducted several experiments using a real ontology and a dataset to test the proposed model. The experiments, run with three different classification algorithms using the baseline ontology, the improved concept vector space model with the new concept weighting technique, and the enriched ontology, show that the proposed model achieves a considerable improvement in classification performance.
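
The exact matching technique described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `exact_match_concepts`, the tokenization rule, and the sample concept labels are all assumptions made for illustration. The sketch only counts concept labels that occur verbatim in a document, which also shows the limitation the abstract points out: a surface variant such as a plural form is not matched.

```python
import re
from collections import Counter


def _normalize(text):
    # Lowercase and keep only alphanumeric tokens, joined by single spaces.
    # This tokenization is an illustrative assumption, not taken from the paper.
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def exact_match_concepts(text, concept_labels):
    """Count verbatim occurrences of each concept label in the text.

    Hypothetical helper: only labels that explicitly occur in the
    document contribute to the resulting concept counts.
    """
    normalized = _normalize(text)
    counts = Counter()
    for label in concept_labels:
        norm_label = _normalize(label)
        if not norm_label:
            continue
        # Require token boundaries so "grant" does not match inside "grants".
        pattern = r"(?<![a-z0-9])" + re.escape(norm_label) + r"(?![a-z0-9])"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[label] = hits
    return counts


if __name__ == "__main__":
    # Invented sample document and labels, for illustration only.
    doc = ("The funding call offers a research grant. "
           "Grants cover travel; each grant lasts a year.")
    labels = ["grant", "research grant", "project"]
    print(exact_match_concepts(doc, labels))
```

In this sample, the plural "Grants" contributes nothing to the concept "grant" because it is not an exact label match. This is the kind of gap that the proposed model addresses by extracting additional terms semantically related to ontology concepts.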

Place, publisher, year, edition, pages
ACM Publications, 2017. p. 245-251
Keywords [en]
Document classification, First Sense Heuristic, Maximizing Semantic Similarity, Ontology, SEMCON, iCVS
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Information Systems
Identifiers
URN: urn:nbn:se:lnu:diva-88767; DOI: 10.1145/3093241.3107883; ISBN: 978-1-4503-5241-3 (print); OAI: oai:DiVA.org:lnu-88767; DiVA, id: diva2:1346434
Conference
International Conference on Compute and Data Analysis - ICCDA, May 19 - 23, 2017, Lakeland, FL, USA
Available from: 2019-08-27. Created: 2019-08-27. Last updated: 2019-09-06. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text: http://doi.acm.org/10.1145/3093241.3107883

Authority records BETA

Kastrati, Zenun
