2021 (English). In: Proceedings of the 2021 Swedish Workshop on Data Science (SweDS) / [ed] Rafael M. Martins, Morgan Ericsson, Danny Weyns, Kostiantyn Kucher, IEEE, 2021, p. 1-8. Conference paper, Published paper (Refereed)
Abstract [en]
In this paper, we present our methodology for supervised stance classification of sparse and imbalanced social media data. We test our framework on a manually labeled dataset of 5700 Swedish-language messages about immigration posted on the Flashback forum, a controversial online discussion platform. Our proposed approach currently achieves a macro-averaged F1-score of 0.72 on test data for a two-class problem, compared with 0.27 for a baseline four-class model. Since effective classification of imbalanced and sparse textual data in under-resourced languages presents certain methodological challenges, our study contributes to a discussion on the best pathways to achieve the highest model performance given the character of the data and the unavailability of large training datasets for this task. Moreover, this work exemplifies the application of ML methodology to social media data, which can be particularly relevant for social scientists working in this area who are interested in leveraging the possibilities of machine learning in their research field. This methodology and the obtained results provide a foundation for further in-depth analyses of social media texts in the Swedish language following a data-driven approach.
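The macro-averaged F1-score reported in the abstract is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The sketch below illustrates that metric only; it is not the authors' pipeline. The function name `macro_f1`, the class names, and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, made-up labels: 8 messages of one class, 2 of the other.
y_true = ["against"] * 8 + ["favor"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["against"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                           # 0.8
print(round(macro_f1(y_true, y_pred), 3))  # 0.444
```

On this toy input the always-majority classifier has an accuracy of 0.8 but a macro-averaged F1 of about 0.44, because the minority class contributes an F1 of 0. This is one common reason to report macro-averaged F1 on imbalanced data.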
Place, publisher, year, edition, pages
IEEE, 2021
Keywords
social media, sentiment classification, stance classification, supervised learning, Swedish text data classification
National Category
Natural Language Processing; Peace and Conflict Studies; Other Social Sciences not elsewhere specified
Research subject
Social Sciences; Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-108362 (URN); 10.1109/SweDS53855.2021.9637718 (DOI); 000833296400001 (); 2-s2.0-85123826996 (Scopus ID); 9781665418300 (ISBN)
Conference
2021 Swedish Workshop on Data Science (SweDS), Växjö, Sweden, December 2-3, 2021
Projects
DISA
2021-12-03; 2021-12-03; 2025-02-20; Bibliographically approved