Classifying European Court of Human Rights Cases Using Transformer-Based Techniques
Norwegian University of Science and Technology, Norway.
Norwegian University of Science and Technology, Norway.
Linnaeus University, Faculty of Technology, Department of Informatics. Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). ORCID iD: 0000-0002-0199-2377
Sukkur IBA University, Pakistan.
2023 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 11, p. 55664-55676. Article in journal (Refereed). Published.
Abstract [en]

In the field of text classification, researchers have repeatedly shown the value of transformer-based models such as Bidirectional Encoder Representations from Transformers (BERT) and its variants. Nonetheless, these models are expensive in terms of memory and computational power, and they have not been utilized to classify long documents across several domains. In addition, transformer models are often pre-trained on general-purpose language, making them less effective in language-specific domains such as legal documents. In the natural language processing (NLP) domain, there is growing interest in creating models that can handle longer input sequences and domain-specific language. With this in mind, this study proposes a legal document classifier that uses a sliding-window approach to increase the effective maximum sequence length of the model. We used the publicly available ECHR (European Court of Human Rights) dataset, which is largely imbalanced; to balance it, we scraped additional case articles from the web and extracted the data. We then employed conventional machine learning techniques, namely SVM, DT, NB, and AdaBoost, as well as transformer-based neural network models, including BERT, Legal-BERT, RoBERTa, BigBird, ELECTRA, and XLNet, for the classification task. The experimental findings show that RoBERTa outperformed all the other BERT variants mentioned, obtaining a precision, recall, and F1-score of 89.1%, 86.2%, and 86.7%, respectively. Among the conventional machine learning techniques, AdaBoost outclassed SVM, DT, and NB, achieving 81.9%, 81.5%, and 81.7% for precision, recall, and F1-score, respectively.
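The abstract describes splitting each long legal document into overlapping windows so that a fixed-length transformer can process it. Below is a minimal sketch of that sliding-window idea using the Hugging Face transformers library; the checkpoint name, window size, stride, label count, and the mean-pooling of per-window logits are illustrative assumptions, not details confirmed by the paper.

# Minimal sketch of the sliding-window classification idea from the abstract.
# Checkpoint, window size, stride, label count, and the aggregation over
# windows are assumptions for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"   # assumed checkpoint; the classification head still needs fine-tuning
NUM_LABELS = 2                # assumed label count

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)
model.eval()

def classify_long_document(text: str, max_length: int = 512, stride: int = 128) -> int:
    # Tokenize the document into overlapping windows that each fit the model's
    # maximum sequence length; `stride` tokens are shared between consecutive
    # windows so context is not cut off abruptly at window boundaries.
    enc = tokenizer(
        text,
        max_length=max_length,
        stride=stride,
        truncation=True,
        return_overflowing_tokens=True,
        padding="max_length",
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(input_ids=enc["input_ids"],
                       attention_mask=enc["attention_mask"]).logits
    # Aggregate the per-window predictions (here: mean over windows) into a
    # single document-level label.
    return int(logits.mean(dim=0).argmax().item())

In the study itself, the conventional classifiers (SVM, DT, NB, AdaBoost) and the other listed transformer models would take the place of this single checkpoint; averaging window logits is only one of several reasonable aggregation choices.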

Place, publisher, year, edition, pages
IEEE, 2023. Vol. 11, p. 55664-55676
National Category
Information Systems
Research subject
Computer and Information Sciences Computer Science; Information Systems
Identifiers
URN: urn:nbn:se:lnu:diva-121563; DOI: 10.1109/ACCESS.2023.3279034; ISI: 001005807300001; Scopus ID: 2-s2.0-85161038665; OAI: oai:DiVA.org:lnu-121563; DiVA id: diva2:1764679
Available from: 2023-06-08. Created: 2023-06-08. Last updated: 2023-08-08. Bibliographically approved.

Open Access in DiVA

fulltext (1703 kB), 241 downloads
File information
File name: FULLTEXT01.pdf; File size: 1703 kB; Checksum: SHA-512
a2617056fe701b30b8d9f07d2a3325c6d92a38659fee8561eb9425c624d932ba5c8698525fffe7518be751c88f0d028edc4048ae2a7d0f194c34472243edeb3b
Type: fulltext; Mimetype: application/pdf

Other links

Publisher's full text; Scopus

Authority records

Kastrati, Zenun
