Academic Vocabulary in Wikipedia Articles: Frequency and Dispersion in Uneven Datasets
University of Helsinki, Finland.
Linnaeus University, Faculty of Arts and Humanities, Department of Languages. University of Helsinki, Finland.
2019 (English). In: From Data to Evidence in English Language Research / [ed] Carla Suhr, Terttu Nevalainen, Irma Taavitsainen, Leiden: Brill Academic Publishers, 2019, p. 282-306. Chapter in book (Refereed)
Abstract [en]

Despite its popularity, the status of Wikipedia in higher education settings remains somewhat controversial, and the linguistic characteristics of the genre have not been exhaustively described. This exploratory paper takes a data-driven approach to assessing the use of academic vocabulary in Wikipedia articles. Our analysis is based on Coxhead’s Academic Word List, and the data comes from the Westbury Lab Wikipedia Corpus. We employ methods of statistical data analysis to classify Wikipedia articles according to the frequencies of academic words, and apply the same procedure to a comparable set of texts representing another genre, published research articles (RAs). The unsupervised classification procedure groups the articles according to academic content regardless of topic, which allows us to measure genre-specific similarities. The findings show that academic words are common in both genres and, more interestingly, that in terms of aggregate frequencies of academic words, Wikipedia articles are not markedly different from RAs within the same discipline. That said, we observe disciplinary differences in the distribution of academic words in Wikipedia: Economics writing contains more academic words than the other two disciplines in focus. Disciplinary differences can likewise be observed in the distribution of individual academic words.

Place, publisher, year, edition, pages
Leiden: Brill Academic Publishers, 2019. p. 282-306
Series
Language and Computers, ISSN 0921-5034 ; 83
Keywords [en]
wikipedia, corpus linguistics, dispersion, statistics
National Category
Specific Languages
Research subject
Humanities, English
Identifiers
URN: urn:nbn:se:lnu:diva-79630
DOI: 10.1163/9789004390652_013
ISBN: 978-90-04-39065-2 (electronic)
ISBN: 978-90-04-39064-5 (print)
OAI: oai:DiVA.org:lnu-79630
DiVA, id: diva2:1280333
Available from: 2019-01-18 Created: 2019-01-18 Last updated: 2019-02-11 Bibliographically approved

Authority records

Tyrkkö, Jukka
