Utilizing Multilingual Language Data in (Nearly) Real Time: The Case of the Nordic Tweet Stream
Laitinen, Mikko. Linnaeus University, Faculty of Arts and Humanities, Department of Languages. Univ Eastern Finland, Finland. ORCID iD: 0000-0003-3123-6932
Lundberg, Jonas. Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM), Department of Computer Science. ORCID iD: 0000-0001-9775-4594
Levin, Magnus. Linnaeus University, Faculty of Arts and Humanities, Department of Languages. ORCID iD: 0000-0002-5613-7618
Lakaw, Alexander. Linnaeus University, Faculty of Arts and Humanities, Department of Languages. ORCID iD: 0000-0002-5985-6183
2017 (English). In: Journal of universal computer science (Online), ISSN 0948-695X, E-ISSN 0948-6968, Vol. 23, no 11, p. 1038-1056. Article in journal (Refereed). Published
Abstract [en]

This paper presents the Nordic Tweet Stream, a cross-disciplinary digital humanities project that downloads Twitter messages from Denmark, Finland, Iceland, Norway and Sweden. The paper first introduces some of the technical aspects of creating a real-time monitor corpus that grows every day, and then two case studies illustrate how the corpus could be used as empirical evidence in studies focusing on the global spread of English. Our approach in the case studies is sociolinguistic, and we are interested in how widespread multilingualism involving English is in the region, and in what happens to ongoing grammatical change in digital environments. The results are based on 6.6 million tweets collected during the first four months of data streaming. They show that English was the most frequently used language, accounting for almost a third of the tweets. This indicates that Nordic Twitter users choose English as a means of reaching wider audiences. The preference for English is strongest in Denmark and weakest in Finland. Tweeting mostly occurs late in the evening, and high-profile media events such as the Eurovision Song Contest produce considerable peaks in Twitter activity. The prevalent use of informal features such as univerbated verb forms (e.g., gotta for (HAVE) got to) supports previous findings of the speech-like nature of written Twitter data, but the results indicate that tweeters are pushing the limits even further.

Place, publisher, year, edition, pages
2017. Vol. 23, no 11, p. 1038-1056
Keywords [en]
Twitter, corpus linguistics, language choice, oral discourse style
National Category
Computer and Information Sciences
Research subject
Computer and Information Sciences, Computer Science
Identifiers
URN: urn:nbn:se:lnu:diva-73133
ISI: 000429070900004
OAI: oai:DiVA.org:lnu-73133
DiVA, id: diva2:1199479
Available from: 2018-04-20. Created: 2018-04-20. Last updated: 2018-05-17. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records BETA

Laitinen, Mikko; Lundberg, Jonas; Levin, Magnus; Lakaw, Alexander
