Publications (10 of 20)
Skeppstedt, M., Ahltorp, M., Kerren, A., Rzepka, R. & Araki, K. (2019). Application of a topic model visualisation tool to a second language. In: Book of Abstracts of the CLARIN Annual Conference 2019, Leipzig, Germany. Paper presented at CLARIN Annual Conference 2019, 30 September - 2 October 2019, Leipzig, Germany.
2019 (English). In: Book of Abstracts of the CLARIN Annual Conference 2019, Leipzig, Germany, 2019. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

We explored adaptions required for applying a topic modelling tool to a language that is very different from the one for which the tool was originally developed. The tool, which enables text analysis on the output of topic modelling, was developed for English, and we here applied it on Japanese texts. As white space is not used for indicating word boundaries in Japanese, the texts had to be pre-tokenised and white space inserted to indicate a token segmentation, before the texts could be imported into the tool. The tool was also extended by the addition of word translations and phonetic readings to support users who are second-language speakers of Japanese.
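
The pre-tokenisation step mentioned in this abstract can be illustrated with a small sketch. The abstract does not name the tokeniser that was used, so the code below swaps in a toy greedy longest-match segmenter; the function name `segment`, the `max_len` parameter and the four-entry lexicon are all hypothetical and only serve to show how white space might be inserted between tokens before import.

```python
def segment(text, lexicon, max_len=4):
    """Greedy longest-match segmentation (toy stand-in for a real tokeniser).

    At each position, take the longest substring found in the lexicon;
    fall back to a single character when nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Hypothetical mini-lexicon and sentence ("I am a student").
lexicon = {"私", "は", "学生", "です"}
tokens = segment("私は学生です", lexicon)

# White space now marks the token boundaries.
spaced = " ".join(tokens)
print(spaced)
```

A production setting would more likely rely on an established morphological analyser; the point here is only the output format, one space between tokens.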

Keywords
Topic Models, Visualization, Japanese, Text Mining, Visual Text Analysis
National Category
Language Technology (Computational Linguistics); Human Computer Interaction
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-87108 (URN)
Conference
CLARIN Annual Conference 2019, 30 September - 2 October 2019, Leipzig, Germany
Projects
DISA-DH
Funder
Swedish Research Council, 2017-00626
Available from: 2019-08-07 Created: 2019-08-07 Last updated: 2019-09-24
Skeppstedt, M., Kerren, A. & Stede, M. (2019). Finding Reasons for Vaccination Hesitancy: Evaluating Semi-Automatic Coding of Internet Discussion Forums. In: Lucila Ohno-Machado and Brigitte Séroussi (Ed.), MEDINFO 2019: Health and Wellbeing e-Networks for All: Proceedings of the 17th World Congress on Medical and Health Informatics. Paper presented at 17th World Congress on Medical and Health Informatics (MEDINFO '19), 25-30 August, 2019, Lyon, France. (pp. 348-352). IOS Press
2019 (English). In: MEDINFO 2019: Health and Wellbeing e-Networks for All: Proceedings of the 17th World Congress on Medical and Health Informatics / [ed] Lucila Ohno-Machado and Brigitte Séroussi, IOS Press, 2019, p. 348-352. Conference paper, Published paper (Refereed)
Abstract [en]

Computer-assisted text coding can facilitate the analysis of large text collections. To evaluate the functionality of providing an analyst with a ranked list of suggestions for suitable text codes, we used a data set of discussion posts, which had been manually coded for reasons given for taking a stance on the topic of vaccination. We trained a logistic regression classifier to rank these reasons according to the probability that they would be present in the post. The approach was evaluated for its ability to include the expected reasons among the n top-ranked reasons, using an n between 1 and 6. The logistic regression-based ranking was more effective than the baseline, which ranked reasons according to their frequency in the training data. To provide such a list of possible codes, ranked by logistic regression, could therefore be a useful feature in a tool for text coding.
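
The evaluation idea in this abstract, checking whether the expected reasons appear among the n top-ranked suggestions, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: no classifier is trained, the per-post scores stand in for classifier probabilities, and the reason labels, function names and the recall-style metric are hypothetical choices, since the abstract does not specify the exact measure.

```python
from collections import Counter


def rank_by_score(scores):
    """Return labels sorted by descending score (ties broken alphabetically)."""
    return [label for label, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


def frequency_baseline(training_labels):
    """Baseline: rank labels by how often they occur in the training data."""
    return rank_by_score(Counter(training_labels))


def top_n_recall(ranked, expected, n):
    """Fraction of the expected labels found among the n top-ranked labels."""
    if not expected:
        return 1.0
    top = set(ranked[:n])
    return sum(1 for label in expected if label in top) / len(expected)


# Hypothetical training labels and hypothetical scores for one post.
training_labels = ["safety", "safety", "safety", "efficacy", "efficacy", "trust"]
post_scores = {"safety": 0.10, "efficacy": 0.30, "trust": 0.60}
expected = {"trust"}

baseline_ranking = frequency_baseline(training_labels)
model_ranking = rank_by_score(post_scores)

for n in range(1, 4):
    print(n, top_n_recall(model_ranking, expected, n), top_n_recall(baseline_ranking, expected, n))
```

In this toy case the score-based ranking places the expected reason first, while the frequency baseline only reaches it at n = 3.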

Place, publisher, year, edition, pages
IOS Press, 2019
Series
Studies in Health Technology and Informatics, ISSN 0926-9630, E-ISSN 1879-8365 ; 264
Keywords
Vaccination Refusal, Text Mining, Supervised Machine Learning
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-82449 (URN)
10.3233/SHTI190241 (DOI)
978-1-64368-003-3 (ISBN)
978-1-64368-002-6 (ISBN)
Conference
17th World Congress on Medical and Health Informatics (MEDINFO '19), 25-30 August, 2019, Lyon, France.
Projects
Navigating in streams of opinions: Extracting and visualising arguments in opinionated texts
Funder
Swedish Research Council, 2016-06681
Available from: 2019-05-06 Created: 2019-05-06 Last updated: 2019-08-28
Kucher, K., Skeppstedt, M. & Kerren, A. (2018). Application of Interactive Computer-Assisted Argument Extraction to Opinionated Social Media Texts. In: Karsten Klein, Yi-Na Li, and Andreas Kerren (Ed.), Proceedings of the 11th International Symposium on Visual Information Communication and Interaction (VINCI '18). Paper presented at 11th International Symposium on Visual Information Communication and Interaction (VINCI '18), 13-15 August 2018, Växjö, Sweden (pp. 102-103). Association for Computing Machinery (ACM)
2018 (English). In: Proceedings of the 11th International Symposium on Visual Information Communication and Interaction (VINCI '18) / [ed] Karsten Klein, Yi-Na Li, and Andreas Kerren, Association for Computing Machinery (ACM), 2018, p. 102-103. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

The analysis of various opinions and arguments in textual data can be facilitated by automatic topic modeling methods; however, the exploration and interpretation of the resulting topics and terms may prove to be difficult to the analysts. Opinions, stances, arguments, topics, terms, and text documents are usually connected with many-to-many relationships for such tasks. Exploratory visual analysis with interactive tools can help the analysts to get an overview of the topics and opinions, identify particularly interesting documents, and describe main themes of various arguments. In our previous work, we introduced an interactive tool called Topics2Themes that was used for topic and theme analysis of vaccination-related discussion texts with a limited set of stance categories. In this poster paper, we describe an application of Topics2Themes to a different genre of data, namely, political comments from Reddit, and multiple sentiment and stance categories detected with automatic classifiers.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2018
Keywords
visualization, interaction, topic modeling, argument extraction, text visualization, sentiment analysis, sentiment visualization, stance analysis, stance visualization, annotation
National Category
Computer Sciences; Language Technology (Computational Linguistics); Human Computer Interaction
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-75856 (URN)
10.1145/3231622.3232505 (DOI)
2-s2.0-85055567433 (Scopus ID)
978-1-4503-6501-7 (ISBN)
Conference
11th International Symposium on Visual Information Communication and Interaction (VINCI '18), 13-15 August 2018, Växjö, Sweden
Funder
Swedish Research Council, 2016-06681
Available from: 2018-06-13 Created: 2018-06-13 Last updated: 2019-08-29 Bibliographically approved
Skeppstedt, M., Stede, M. & Kerren, A. (2018). Stance-Taking in Topics Extracted from Vaccine-Related Tweets and Discussion Forum Posts. In: Graciela Gonzalez-Hernandez, Davy Weissenbacher, Abeed Sarker, and Michael Paul (Ed.), Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop and Shared Task. Paper presented at 3rd Workshop on Social Media Mining for Health Applications (SMM4H '18) at EMNLP '18, 31 Oct - 1 Nov, 2018, Brussels, Belgium (pp. 5-8). Association for Computational Linguistics, Article ID W18-5902.
2018 (English). In: Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop and Shared Task / [ed] Graciela Gonzalez-Hernandez, Davy Weissenbacher, Abeed Sarker, and Michael Paul, Association for Computational Linguistics, 2018, p. 5-8, article id W18-5902. Conference paper, Published paper (Refereed)
Abstract [en]

The occurrence of stance-taking towards vaccination was measured in documents extracted by topic modelling from two different corpora, one discussion forum corpus and one tweet corpus. For some of the topics extracted, their most closely associated documents contained a proportion of vaccine stance-taking texts that exceeded the corpus average by a large margin. These extracted document sets would, therefore, form a useful resource in a process for computer-assisted analysis of argumentation on the subject of vaccination.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2018
Keywords
text analysis, topic modelling, stance
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science, Information and software visualization; Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-77344 (URN)
978-1-948087-77-3 (ISBN)
Conference
3rd Workshop on Social Media Mining for Health Applications (SMM4H '18) at EMNLP '18, 31 Oct - 1 Nov, 2018, Brussels, Belgium
Projects
Navigating in streams of opinions
Funder
Swedish Research Council, 2016-06681
Available from: 2018-08-27 Created: 2018-08-27 Last updated: 2018-12-11 Bibliographically approved
Skeppstedt, M., Kucher, K., Stede, M. & Kerren, A. (2018). Topics2Themes: Computer-Assisted Argument Extraction by Visual Analysis of Important Topics. In: Mennatallah El-Assady, Annette Hautli-Janisz, and Verena Lyding (Ed.), Proceedings of the LREC 2018 Workshop “The 3rd Workshop on Visualization as Added Value in the Development, Use and Evaluation of Language Resources (VisLR III)”. Paper presented at 3rd Workshop on Visualization as Added Value in the Development, Use and Evaluation of Language Resources (VisLR III) at LREC '18, 12 May, 2018, Miyazaki, Japan (pp. 9-16). Paris, France: European Language Resources Association
2018 (English). In: Proceedings of the LREC 2018 Workshop “The 3rd Workshop on Visualization as Added Value in the Development, Use and Evaluation of Language Resources (VisLR III)” / [ed] Mennatallah El-Assady, Annette Hautli-Janisz, and Verena Lyding, Paris, France: European Language Resources Association, 2018, p. 9-16. Conference paper, Published paper (Refereed)
Abstract [en]

While the task of manually extracting arguments from large collections of opinionated text is an intractable one, a tool for computer-assisted extraction can (i) select a subset of the text collection that contains re-occurring arguments to minimise the amount of text that the human coder has to read, and (ii) present the selected texts in a way that facilitates manual coding of arguments. We propose a tool called Topics2Themes that uses topic modelling to extract important topics, as well as the terms and texts most closely associated with each topic. We also provide a graphical user interface for manual argument coding, in which the user can search for arguments in the texts selected, create a theme for each type of argument detected and connect it to the texts in which it is found. Topics, terms, texts and themes are displayed as elements in four separate lists, and associations between the elements are visualised through connecting links. It is also possible to focus on one particular element through the sorting functionality provided, which can be used to facilitate the argument coding and gain an overview and understanding of the arguments found in the texts.

Place, publisher, year, edition, pages
Paris, France: European Language Resources Association, 2018
Keywords
argument extraction, topic modelling, text analysis, argument visualization, stance visualization, text visualization, information visualization, interaction
National Category
Language Technology (Computational Linguistics); Computer Sciences
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-70911 (URN)
979-10-95546-13-9 (ISBN)
Conference
3rd Workshop on Visualization as Added Value in the Development, Use and Evaluation of Language Resources (VisLR III) at LREC '18, 12 May, 2018, Miyazaki, Japan
Projects
StaViCTA
Funder
Swedish Research Council, 2012-5659
Swedish Research Council, 2016-06681
Available from: 2018-02-14 Created: 2018-02-14 Last updated: 2018-10-15 Bibliographically approved
Skeppstedt, M. & Ahltorp, M. (2018). Towards a structured evaluation of improv-bots: Improvisational theatre as a non-goal-driven dialogue system. In: CEUR Workshop Proceedings. Paper presented at 3rd Linguistic and Cognitive Approaches To Dialog Agents Workshop, LaCATODA 2018, 13 July 2018, Stockholm (pp. 37-43). CEUR-WS, 2202
2018 (English). In: CEUR Workshop Proceedings, CEUR-WS, 2018, Vol. 2202, p. 37-43. Conference paper, Published paper (Refereed)
Abstract [en]

We have here suggested a structured procedure for evaluating artificially produced improvisational theatre dialogue. We have, in addition, provided some examples of dialogues generated within the evaluation framework suggested. Although the end goal of a bot that produces improvisational theatre should be to perform against human actors, we consider the task of having two improv-bots perform against each other as a setting for which it is easier to carry out a reliable evaluation. To better approximate the end goal of having two independent entities that act against each other, we suggest that these two bots should not be allowed to be trained on the same training data. In addition, we suggest the use of the two initial dialogue lines from human-written dialogues as input for the artificially generated scenes, as well as to use the same human-written dialogues in the evaluation procedure for the artificially generated theatre dialogues. © 2018 CEUR-WS. All rights reserved.

Place, publisher, year, edition, pages
CEUR-WS, 2018
Series
CEUR Workshop Proceedings, E-ISSN 1613-0073
Keywords
Artificial intelligence, Botnet, Linguistics, Speech processing, Theaters, Dialogue systems, Evaluation framework, Goal driven, Human actor, Structured evaluation, Training data, Petroleum reservoir evaluation
National Category
Computer Systems
Research subject
Computer Science, Software Technology
Identifiers
urn:nbn:se:lnu:diva-83584 (URN)
2-s2.0-85053832197 (Scopus ID)
Conference
3rd Linguistic and Cognitive Approaches To Dialog Agents Workshop, LaCATODA 2018, 13 July 2018, Stockholm
Available from: 2019-05-27 Created: 2019-05-27 Last updated: 2019-06-14 Bibliographically approved
Skeppstedt, M., Kerren, A. & Stede, M. (2018). Vaccine Hesitancy in Discussion Forums: Computer-Assisted Argument Mining with Topic Models. In: Adrien Ugon, Daniel Karlsson, Gunnar O. Klein, and Anne Moen (Ed.), Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth. Paper presented at 29th Medical Informatics Europe Conference (MIE '18), April 24-26, 2018, Gothenburg, Sweden (pp. 366-370). IOS Press
2018 (English). In: Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth / [ed] Adrien Ugon, Daniel Karlsson, Gunnar O. Klein, and Anne Moen, IOS Press, 2018, p. 366-370. Conference paper, Published paper (Refereed)
Abstract [en]

Arguments used when vaccination is debated on Internet discussion forums might give us valuable insights into reasons behind vaccine hesitancy. In this study, we applied automatic topic modelling on a collection of 943 discussion posts in which vaccine was debated, and six distinct discussion topics were detected by the algorithm. When manually coding the posts ranked as most typical for these six topics, a set of semantically coherent arguments were identified for each extracted topic. This indicates that topic modelling is a useful method for automatically identifying vaccine-related discussion topics and for identifying debate posts where these topics are discussed. This functionality could facilitate manual coding of salient arguments, and thereby form an important component in a system for computer-assisted coding of vaccine-related discussions. 

Place, publisher, year, edition, pages
IOS Press, 2018
Series
Studies in Health Technology and Informatics, ISSN 0926-9630, E-ISSN 1879-8365 ; 247
Keywords
vaccine hesitancy, topic modelling, argument mining
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-70919 (URN)
10.3233/978-1-61499-852-5-366 (DOI)
2-s2.0-85046551652 (Scopus ID)
978-1-61499-851-8 (ISBN)
978-1-61499-852-5 (ISBN)
Conference
29th Medical Informatics Europe Conference (MIE '18), April 24-26, 2018, Gothenburg, Sweden
Projects
StaViCTA
Funder
Swedish Research Council, 2016-06681
Swedish Research Council, 2012-5659
Available from: 2018-02-15 Created: 2018-02-15 Last updated: 2019-08-29 Bibliographically approved
Simaki, V., Paradis, C., Skeppstedt, M., Sahlgren, M., Kucher, K. & Kerren, A. (2017). Annotating speaker stance in discourse: the Brexit Blog Corpus. Corpus linguistics and linguistic theory
2017 (English). In: Corpus linguistics and linguistic theory, ISSN 1613-7027, E-ISSN 1613-7035. Article in journal (Refereed). Epub ahead of print
Abstract [en]

The aim of this study is to explore the possibility of identifying speaker stance in discourse, provide an analytical resource for it and an evaluation of the level of agreement across speakers. We also explore to what extent language users agree about what kind of stances are expressed in natural language use or whether their interpretations diverge. In order to perform this task, a comprehensive cognitive-functional framework of ten stance categories was developed based on previous work on speaker stance in the literature. A corpus of opinionated texts was compiled, the Brexit Blog Corpus (BBC). An analytical protocol and interface (ALVA) for the annotations was set up and the data were independently annotated by two annotators. The annotation procedure, the annotation agreements and the co-occurrence of more than one stance in the utterances are described and discussed. The careful, analytical annotation process has returned satisfactory inter- and intra-annotation agreement scores, resulting in a gold standard corpus, the final version of the BBC. 

Keywords
text annotation, blog post texts, modality, evaluation, positioning
National Category
Language Technology (Computational Linguistics); General Language Studies and Linguistics
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-67319 (URN)
10.1515/cllt-2016-0060 (DOI)
Projects
StaViCTA
Funder
Swedish Research Council, 2012-5659
Note

TO BE PUBLISHED!

Available from: 2017-08-21 Created: 2017-08-21 Last updated: 2019-08-28
Skeppstedt, M., Kerren, A. & Stede, M. (2017). Automatic detection of stance towards vaccination in online discussion forums. In: Jitendra Jonnagaddala, Hong-Jie Dai, and Yung-Chun Chang (Ed.), Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017). Paper presented at 1st International Workshop on Digital Disease Detection using Social Media (DDDSM), Taipei, Taiwan, 27 November, 2017 (pp. 1-8). Association for Computational Linguistics
2017 (English). In: Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017) / [ed] Jitendra Jonnagaddala, Hong-Jie Dai, and Yung-Chun Chang, Association for Computational Linguistics, 2017, p. 1-8. Conference paper, Published paper (Refereed)
Abstract [en]

A classifier for automatic detection of stance towards vaccination in online forums was trained and evaluated. Debate posts from six discussion threads on the British parental website Mumsnet were manually annotated for stance against or for vaccination, or as undecided. A support vector machine, trained to detect the three classes, achieved a macro F-score of 0.44, while a macro F-score of 0.62 was obtained by the same type of classifier on the binary classification task of distinguishing stance against vaccination from stance for vaccination. These results show that vaccine stance detection in online forums is a difficult task, at least for the type of model investigated and for the relatively small training corpus that was used. Future work will therefore include an expansion of the training data and an evaluation of other types of classifiers and features.
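
For readers unfamiliar with the macro F-score reported in this abstract, the sketch below shows one common way to compute it: the unweighted mean of the per-class F1 scores. The abstract does not state which macro-averaging variant was used, so this definition is an assumption, and the gold and predicted labels are invented toy data rather than results from the paper.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Invented toy labels for the three stance classes.
gold = ["against", "against", "for", "for", "undecided", "undecided"]
predicted = ["against", "for", "for", "for", "undecided", "against"]

score = macro_f1(gold, predicted)
print(round(score, 3))
```

Because every class contributes equally to the mean, a small class that is classified poorly lowers the macro F-score as much as a large one would.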

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2017
Keywords
stance, online forums, classifier, support vector machine, vaccination
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science, Information and software visualization
Identifiers
urn:nbn:se:lnu:diva-68982 (URN)
978-1-948087-07-0 (ISBN)
Conference
1st International Workshop on Digital Disease Detection using Social Media (DDDSM), Taipei, Taiwan, 27 November, 2017
Projects
StaViCTA
Navigating in streams of opinions
Funder
Swedish Research Council, 2016-06681
Swedish Research Council, 2012-5659
Available from: 2017-11-24 Created: 2017-11-24 Last updated: 2018-02-09 Bibliographically approved
Skeppstedt, M., Simaki, V., Paradis, C. & Kerren, A. (2017). Detection of Stance and Sentiment Modifiers in Political Blogs. In: Alexey Karpov, Rodmonga Potapova, and Iosif Mporas (Ed.), Speech and Computer: 19th International Conference, SPECOM 2017, Hatfield, UK, September 12-16, 2017, Proceedings. Paper presented at 19th International Conference on Speech and Computer (SPECOM '17), 12-16 September 2017, Hatfield, Hertfordshire, UK (pp. 302-311). Springer International Publishing
2017 (English). In: Speech and Computer: 19th International Conference, SPECOM 2017, Hatfield, UK, September 12-16, 2017, Proceedings / [ed] Alexey Karpov, Rodmonga Potapova, and Iosif Mporas, Springer International Publishing, 2017, p. 302-311. Conference paper, Published paper (Refereed)
Abstract [en]

The automatic detection of seven types of modifiers was studied: Certainty, Uncertainty, Hypotheticality, Prediction, Recommendation, Concession/Contrast and Source. A classifier aimed at detecting local cue words that signal the categories was the most successful method for five of the categories. For Prediction and Hypotheticality, however, better results were obtained with a classifier trained on tokens and bi-grams present in the entire sentence. Unsupervised cluster features were shown useful for the categories Source and Uncertainty, when a subset of the training data available was used. However, when all of the 2,095 sentences that had been actively selected and manually annotated were used as training data, the cluster features had a very limited effect. Some of the classification errors made by the models would be possible to avoid by extending the training data set, while other features and feature representations, as well as the incorporation of pragmatic knowledge, would be required for other error types. 

Place, publisher, year, edition, pages
Springer International Publishing, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 10458
Keywords
stance modifiers, sentiment modifiers, active learning, unsupervised features, resource-aware natural language processing
National Category
Language Technology (Computational Linguistics); Computer Sciences
Research subject
Computer Science, Information and software visualization; Computer and Information Sciences Computer Science, Computer Science
Identifiers
urn:nbn:se:lnu:diva-64582 (URN)
10.1007/978-3-319-66429-3_29 (DOI)
2-s2.0-85029498983 (Scopus ID)
978-3-319-66428-6 (ISBN)
978-3-319-66429-3 (ISBN)
Conference
19th International Conference on Speech and Computer (SPECOM '17), 12-16 September 2017, Hatfield, Hertfordshire, UK
Projects
StaViCTA
Funder
Swedish Research Council, 2012-5659
Available from: 2017-05-31 Created: 2017-05-31 Last updated: 2019-08-29 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-6164-7762