AI Approaches for Classification and Attribute Extraction in Text
Linnaeus University, Faculty of Technology, Department of Computer Science.
Linnaeus University, Faculty of Technology, Department of Computer Science.
2017 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

As the amount of data online grows, so does the demand to use this data in different applications. Machine learning can be used to reconstruct and validate the data of interest. Although the problem is very domain specific, this report attempts to shed some light on what we call strategies for classification, which in broad terms means a set of steps in a process whose end goal is to have classified some part of the original data. As a result, we hope to bring clarity to the classification process, both in detail and from a broader perspective. The report investigates two classification objectives: one that depends on many variables found in the input data, and one that is more literal and depends on only one or two variables. Specifically, the data we classify are sales-objects. Each sales-object has a text describing the object and a related image. We attempt to place these sales-objects into the correct product category. We also try to derive each object's year of creation and its dimensions, such as height and width. The strategies present different approaches for classifying such attributes. The results showed that for broader attributes such as product category, supervised learning is indeed an appropriate approach, while the same cannot be said for narrower attributes, which instead had to rely on entity recognition. Experiments on image analytics in conjunction with supervised learning showed image analytics to be a useful addition when a higher precision score is required.
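
The abstract notes that narrow attributes such as year of creation and dimensions had to rely on entity recognition rather than supervised learning. As a rough illustration only, the following Python sketch uses hand-written regular expressions as a simple stand-in for an entity recognizer. The patterns, the function name extract_attributes, and the sample description are all assumptions made for this example and are not taken from the thesis.

```python
import re

# Hypothetical patterns (assumptions for this sketch, not the thesis's method).
# A four-digit number between 1500 and 2029 is treated as a candidate year.
YEAR_RE = re.compile(r"\b(1[5-9]\d{2}|20[0-2]\d)\b")

# "height" or "width", an optional colon, a number, and a length unit.
DIM_RE = re.compile(
    r"\b(height|width)\s*:?\s*(\d+(?:[.,]\d+)?)\s*(mm|cm|m)\b",
    re.IGNORECASE,
)


def extract_attributes(description: str) -> dict:
    """Pull a year and any labelled dimensions out of a free-text description."""
    attrs = {}

    # Take the first year-like number found in the text, if any.
    year = YEAR_RE.search(description)
    if year:
        attrs["year"] = int(year.group(1))

    # Collect each labelled dimension as a (value, unit) pair.
    for name, value, unit in DIM_RE.findall(description):
        # Accept a decimal comma as well as a decimal point.
        attrs[name.lower()] = (float(value.replace(",", ".")), unit.lower())

    return attrs


if __name__ == "__main__":
    sample = "Oil painting, signed 1923. Height: 45 cm, width 60,5 cm."
    print(extract_attributes(sample))
```

A rule-based matcher like this only works when the attribute appears literally in the text, which is the sense in which the abstract describes such attributes as depending on one or two variables. It cannot tell a year from another four-digit number, so a real system would need more context than these patterns use.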

Place, publisher, year, edition, pages
2017.
Keywords [en]
text classification, feature extraction, machine learning, scikit
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:lnu:diva-67882
OAI: oai:DiVA.org:lnu-67882
DiVA, id: diva2:1139708
External cooperation
Ventilabis AB
Subject / course
Computer Science
Educational program
Software Technology Programme, 180 credits
Presentation
(Swedish)
Available from: 2017-09-08
Created: 2017-09-08
Last updated: 2018-01-13
Bibliographically approved

Open Access in DiVA

fulltext (914 kB), 151 downloads
File information
File name: FULLTEXT01.pdf
File size: 914 kB
Checksum (SHA-512): cdaf0d80ede78d35752e442fe9a85f3df28b2e016f909ea8452010241d70c2f0f63b8b7fb1a6639e8fad14ba2519a206cd5feebb1b2a9e8a5e6437d92702c5fe
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Magnusson, Ludvig; Rovala, Johan