Identifying the Authors' National Variety of English in Social Media Texts
Linnaeus University, Faculty of Technology, Department of Computer Science. Lund University. (ISOVIS) ORCID iD: 0000-0002-8998-3618
XPLAIN, Greece.
Lund University. ORCID iD: 0000-0002-7240-9003
Linnaeus University, Faculty of Technology, Department of Computer Science. (ISOVIS) ORCID iD: 0000-0002-0519-2537
2017 (English). In: Proceedings of the International Conference on Recent Advances in Natural Language Processing, RANLP 2017 / [ed] Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Ivelina Nikolova, and Irina Temnikova, Association for Computational Linguistics, 2017, 671-678 p. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

In this paper, we present a study on identifying the national variety of English that authors use in social media texts. In data from Facebook and Twitter, information about each author’s social profile is annotated, and the national English variety (US, UK, AUS, CAN, NNS) that each author uses is attributed. We tested four feature types: formal linguistic features, POS features, lexicon-based features related to the different varieties, and data-based features from each English variety. We used various machine learning algorithms for the classification experiments, and we implemented a feature selection process. When the 31 highest-ranked features were used, the classification accuracy reached up to 77.32%. The experimental results are evaluated, and the efficacy of the ranked features is discussed.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2017. 671-678 p.
Keyword [en]
NLP, social media texts, national variety, English, annotations, classification
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science, Information and software visualization; Computer and Information Sciences Computer Science, Computer Science
Identifiers
URN: urn:nbn:se:lnu:diva-66856; DOI: 10.26615/978-954-452-049-6_086; ISBN: 978-954-452-048-9 (print); ISBN: 978-954-452-049-6 (electronic); OAI: oai:DiVA.org:lnu-66856; DiVA: diva2:1120992
Conference
The 11th Biennial Conference on Recent Advances In Natural Language Processing (RANLP '17), 2-8 September 2017, Varna, Bulgaria
Projects
StaViCTA
Funder
Swedish Research Council, 2012-5659
Available from: 2017-07-07; Created: 2017-07-07; Last updated: 2017-11-17

By author/editor
Simaki, Vasiliki; Paradis, Carita; Kerren, Andreas