Converting Dependency Treebanks to MALT-XML
Hall, Johan
Växjö University, Faculty of Mathematics/Science/Technology, School of Mathematics and Systems Engineering.
Nilsson, Jens
Växjö University, Faculty of Mathematics/Science/Technology, School of Mathematics and Systems Engineering.
2005 (English). Report (Other academic)
Abstract [en]

In data-driven approaches to natural language processing, a common problem is the lack of data for many languages. Within the project Stochastic Dependency Grammars for Natural Language Parsing at Växjö University, we (Joakim Nivre, Johan Hall and Jens Nilsson) are developing a deterministic, data-driven dependency parser that is language-independent. In this project we intend to enlarge the data resources for our parser. So far, we have only tested our parser on a small Swedish treebank converted to dependency structure, and on English using the Penn Treebank converted to dependency trees. Since we do not have more Swedish dependency treebanks at hand, we want to broaden our view to treebanks for other languages, especially the larger ones, in order to investigate the influence of data size. Primarily, we are focusing on the Danish Dependency Treebank (DDT) and the Prague Dependency Treebank (PDT). These treebanks are not in a format that our parser can use, so we have to convert them to MALT-XML, a format that our parser can handle.

Place, publisher, year, edition, pages
Växjö: Matematiska och systemtekniska institutionen, 2005, p. 24
Keywords [en]
Dependency Parsing, Treebank, Dependency Structure
National Category
Language Technology (Computational Linguistics)
Research subject
Computer and Information Sciences, Computer Science
Identifiers
URN: urn:nbn:se:vxu:diva-1002
ISBN: ISSN 1650-2647 (print)
OAI: oai:DiVA.org:vxu-1002
DiVA, id: diva2:204778
Available from: 2006-11-22. Created: 2006-11-22. Last updated: 2018-01-13. Bibliographically approved.

Authority records

Hall, Johan; Nilsson, Jens
