AI Approaches for Classification and Attribute Extraction in Text
Linnaeus University, Faculty of Technology (FTK), Department of Computer Science (DV).
2017 (English). Independent thesis, basic level (Bachelor's degree), 10 credits / 15 HE credits. Student thesis (degree project).
Abstract [en]

As the amount of data online grows, so does the urge to use this data for different applications. Machine learning can be used to reconstruct and validate the data of interest. Although the problem is very domain-specific, this report attempts to shed some light on what we call strategies for classification, which in broad terms means a set of steps in a process whose end goal is to classify some part of the original data. In doing so, we hope to bring clarity to the classification process, both in detail and from a broader perspective. The report investigates two classification objectives: one that depends on many variables found in the input data, and one that is more literal and depends on only one or two variables. Specifically, the data we classify are sales-objects. Each sales-object has a text describing the object and a related image. We attempt to place these sales-objects into the correct product category. We also try to derive each object's year of creation and its dimensions, such as height and width. The strategies present different approaches for classifying these attributes. The results showed that for broader attributes such as product category, supervised learning is indeed an appropriate approach, while the same cannot be said for narrower attributes, which instead had to rely on entity recognition. Experiments combining image analytics with supervised learning showed image analytics to be a useful addition when a higher precision score is required.
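
The narrower attributes mentioned in the abstract (year of creation, height, width) are the kind of values that can often be read directly out of a description. The sketch below is not the thesis's implementation; it is a minimal rule-based stand-in for the entity recognition step, using only Python's standard `re` module. The patterns, the function name `extract_attributes`, the "height x width cm" ordering, and the sample description are all assumptions made for illustration.

```python
import re

# Hypothetical patterns (not from the thesis):
# a four-digit year between 1000 and 2099, and "<number> x <number> cm".
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
DIM_RE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*[x×]\s*(\d+(?:[.,]\d+)?)\s*cm",
    re.IGNORECASE,
)


def to_float(number_text):
    """Parse a number that may use a decimal comma, e.g. '60,5'."""
    return float(number_text.replace(",", "."))


def extract_attributes(description):
    """Return the first year and the first height/width pair (cm) found.

    Assumes dimensions are written height first, then width.
    Attributes that are not found are returned as None.
    """
    year = YEAR_RE.search(description)
    dim = DIM_RE.search(description)
    return {
        "year": int(year.group(1)) if year else None,
        "height_cm": to_float(dim.group(1)) if dim else None,
        "width_cm": to_float(dim.group(2)) if dim else None,
    }


# Made-up sales-object description, for illustration only.
sample = "Oil on canvas, signed and dated 1923, 45 x 60,5 cm"
result = extract_attributes(sample)
```

Literal rules like these only work when the value appears verbatim in the text, which may be one reason such attributes are treated differently from a product category that depends on many variables in the input.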

Place, publisher, year, edition, pages
2017.
Keywords [en]
text classification, feature extraction, machine learning, scikit
National subject category
Software Engineering
Identifiers
URN: urn:nbn:se:lnu:diva-67882
OAI: oai:DiVA.org:lnu-67882
DiVA, id: diva2:1139708
External cooperation
Ventilabis AB
Subject / course
Computer Science
Educational program
Software Engineering, 180 HE credits
Presentation
(Swedish)
Supervisors
Examiners
Available from: 2017-09-08 Created: 2017-09-08 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

fulltext (914 kB), 227 downloads
File information
File name: FULLTEXT01.pdf; File size: 914 kB; Checksum: SHA-512
cdaf0d80ede78d35752e442fe9a85f3df28b2e016f909ea8452010241d70c2f0f63b8b7fb1a6639e8fad14ba2519a206cd5feebb1b2a9e8a5e6437d92702c5fe
Type: fulltext; Mimetype: application/pdf

Search further in DiVA

By author/editor
Magnusson, Ludvig; Rovala, Johan
By organisation
Department of Computer Science (DV)
Software Engineering

Total: 227 downloads
The number of downloads is the sum of downloads for all full texts. It may include, for example, earlier versions that are no longer available.

Total: 399 hits