Corpus Categories: What and for whom? When special corpora meet general corpora in comparative studies in literature
Ihrmark, Daniel
Linnaeus University, Faculty of Arts and Humanities, Department of Languages. ORCID iD: 0000-0002-0930-644X
2021 (English)
In: Presented at ICAME 42 2021. Online due to COVID-19, 2021
Conference paper, Oral presentation only (Other academic)
Abstract [en]

Fiction is inherently messy to work with, not because of the material itself but because of the field that surrounds it and the needs of different theoretical approaches to reading the materials. In the decades since Leech & Short’s Style in Fiction (1981), the intersection between linguistics and literature has continued to develop. Mahlberg (2007) has given excellent descriptions of how the relationship between corpus linguistics and literary theory can be approached, and of the framework needed to make successful use of our methodological resources within the field of literature. Biber (2011) provides interesting examples of the different ways corpora have been used in the study of literature and shows how methods such as keyword analysis, n-grams, and collocations have taken center stage. However, the use of large corpora for comparative studies within literature remains problematic, as these corpora were rarely constructed for this purpose.

Special corpora, defined by Tognini Bonelli (2012) as corpora in which the selection is made to be representative not of a language but of a specific use case, play an important role here, as literature corpora are often designed to be representative of an author, a period, or a genre. This makes the categorization procedures very different from those used in general corpora. While categorization in large corpora may combine broad text-type categories with temporal and spatial categorization, as in the BNC and COHA/COCA, special corpora are often interested in other category tags.
The temporal and spatial categorization of texts, along with the text type, becomes central for comparative studies of literature, for instance when a single American author of fiction active from the 1920s to the 1960s is contrasted with American fiction written from the 1920s to the 1960s (as in Sundberg & Nilsson forthcoming).

As the interpretation of data begins, further questions regarding the categorizations arise, often to do with genre and style: one must consider whether our American author active from the 1920s to the 1960s was a modernist, whether they wrote autobiographically, which genre conventions they adhered to, and so forth. Comparing this author to any material that matches the temporal and spatial categorization and is tagged as “fiction” becomes problematic, especially when presenting to an audience that is mainly engaged with those other aspects of the author’s work rather than the “when and where”, for instance at a literary conference (Sundberg 2018, 2019). Categorization within corpora is a well-researched topic that has produced multiple excellent methods of approach, for instance using keywords (Özgür, Özgür & Güngör 2005), named-entity recognition (Sahin et al. 2017) or machine learning (Fabrizio 2002), but these are of limited use when the desired categorizations are based on features not directly tied to the language itself.

As this intersection becomes more popular, a discussion of these uses of corpora as contrastive, or comparative, resources beyond language variants and variation becomes important. How could this new arena influence our categorization habits, and what are the consequences of deeper categorization of fictional texts?

Place, publisher, year, edition, pages
2021.
National Category
Languages and Literature
Identifiers
URN: urn:nbn:se:lnu:diva-118002
OAI: oai:DiVA.org:lnu-118002
DiVA, id: diva2:1721005
Conference
ICAME 42 2021. Online due to COVID-19
Available from: 2022-12-20 Created: 2022-12-20 Last updated: 2023-10-27 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records

Ihrmark, Daniel
