Text-Independent Speaker ID Employing 2D-CNN for Automatic Video Lecture Categorization in a MOOC Setting
Norwegian University of Science and Technology, Norway.
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0002-0199-2377
Norwegian University of Science and Technology, Norway.
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0003-0512-6350
2019 (English). In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), IEEE Press, 2019, p. 273-277. Conference paper, Published paper (Refereed)
Abstract [en]

Massive open online courses (MOOCs) have brought a new form of distance and blended education to the market in recent years, along with many opportunities for the educational sector. The availability of learning content to vast demographics of people across locations has opened up a plethora of possibilities for everyone to gain new knowledge through MOOCs. This poses an immense challenge to content providers, because manually structuring and organizing the content of millions of video lectures daily becomes incredibly challenging. This paper therefore addresses this issue, as a small part of our proposed personalized content management system, by exploiting the voice pattern of the lecturer to identify the speaker and classify video lectures into the right speaker category. Using Mel-frequency cepstral coefficients (MFCC) as 2D input feature maps to a 2D-CNN has shown promising results compared with other machine learning and deep learning classifiers, making text-independent speaker identification plausible in a MOOC setting for automatic video lecture categorization. It will not only help categorize educational videos efficiently for easy search and retrieval but will also promote effective utilization of micro-lectures and multimedia video learning objects (MLO).
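
As a rough illustration of what "MFCC as 2D input feature maps" means, the sketch below turns a one-dimensional audio signal into a frames-by-coefficients matrix, which is the kind of image-like array a 2D-CNN can consume. This record does not state the authors' feature-extraction settings, so everything here is an assumption made for illustration: the function names (`mfcc_map`, `mel_filterbank`, `power_spectrum`), the frame length, hop size, filter count, number of coefficients, and the synthetic sine-wave input. It uses only the Python standard library and a naive DFT to stay self-contained; a real pipeline would more likely rely on an audio library.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [p * n_fft / sample_rate for p in points]  # fractional DFT bin positions
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising edge
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_map(signal, sample_rate, frame_len=64, hop=32, n_filters=12, n_coeffs=8):
    """Return a 2D list: one row of MFCCs per overlapping frame of the signal."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log energy in each mel band (small constant avoids log(0)).
        log_e = [math.log(sum(w * p for w, p in zip(f, spec)) + 1e-10) for f in bank]
        # DCT-II over the log mel energies gives the cepstral coefficients.
        rows.append([sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                         for m, e in enumerate(log_e))
                     for c in range(n_coeffs)])
    return rows

# Toy input (assumption): 512 samples of a 440 Hz tone at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(512)]
features = mfcc_map(signal, sample_rate)
print(len(features), len(features[0]))  # prints: 15 8
```

The resulting 15 x 8 matrix has time along one axis and cepstral coefficients along the other, so a convolutional layer can slide over it much as it would over a single-channel image. How the paper sizes, normalizes, or stacks such maps is not specified in this record.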

Place, publisher, year, edition, pages
IEEE Press, 2019. p. 273-277
Series
Proceedings-International Conference on Tools With Artificial Intelligence, ISSN 1082-3409, E-ISSN 2375-0197
Keywords [en]
speaker identification; speaker classification; DNN; CNN; eLearning; video lectures; MOOCs; MFCC
National Category
Computer Sciences
Research subject
Computer and Information Sciences Computer Science, Computer Science
Identifiers
URN: urn:nbn:se:lnu:diva-92079
DOI: 10.1109/ICTAI.2019.00046
ISI: 000553441500037
Scopus ID: 2-s2.0-85081082924
ISBN: 978-1-7281-3798-8 (electronic)
ISBN: 978-1-7281-3799-5 (print)
OAI: oai:DiVA.org:lnu-92079
DiVA, id: diva2:1393257
Conference
2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, USA, 4-6 Nov. 2019
Available from: 2020-02-14. Created: 2020-02-14. Last updated: 2021-02-03. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Kastrati, Zenun; Kurti, Arianit
