Tree Transformations in Inductive Dependency Parsing
Växjö University, Faculty of Mathematics/Science/Technology, School of Mathematics and Systems Engineering.
2007 (English). Licentiate thesis, monograph (Other academic)
Abstract [en]

This licentiate thesis deals with automatic syntactic analysis, or parsing, of natural languages. A data-driven parser learns to construct syntactic analyses by looking at correctly analyzed sentences, known as training data. The general topic of the thesis is how the training data can be manipulated in order to improve parsing accuracy.

Several studies using constituency-based theories for natural languages in such automatic and data-driven syntactic parsing have shown that training data, annotated according to a linguistic theory, often needs to be adapted in various ways in order to achieve an adequate automatic analysis. A linguistically sound constituent structure is not necessarily well suited for learning and parsing using existing data-driven methods. Modifications to the constituency-based trees in the training data, and corresponding modifications to the parser output, have successfully been applied to increase parsing accuracy. This thesis investigates whether similar modifications, in the form of tree transformations applied to training data annotated with dependency-based structures, can improve accuracy for data-driven dependency parsers. To this end, the thesis focuses on two types of tree transformations.

The first one concerns non-projectivity. The full potential of dependency parsing can only be realized if non-projective constructions are allowed, which pose a problem for projective dependency parsers. On the other hand, non-projective parsers tend, among other things, to be slower. In order to maintain the benefits of projective parsing, a tree transformation technique to recover non-projectivity while using a projective parser is presented here.
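As a rough illustration of what non-projectivity means, the sketch below checks a dependency tree for non-projective arcs and makes it projective by repeatedly lifting an offending dependent to its head's head. This is a minimal sketch under stated assumptions, not the procedure described in the thesis: the `heads` list encoding, the function names, and the example tree are all hypothetical, and the label encoding that would be needed to restore the lifted arcs after parsing is omitted.

```python
# Assumed encoding (hypothetical): tokens are numbered 1..n, token 0 is an
# artificial root, heads[d] is the head of token d, and heads[0] == -1.

def dominates(heads, h, k):
    """Return True if token h equals k or is an ancestor of k."""
    while k != -1:
        if k == h:
            return True
        k = heads[k]
    return False

def is_projective_arc(heads, d):
    """An arc h -> d is projective if h dominates every token between h and d."""
    h = heads[d]
    lo, hi = min(h, d), max(h, d)
    return all(dominates(heads, h, k) for k in range(lo + 1, hi))

def non_projective_arcs(heads):
    """List all (head, dependent) arcs that violate projectivity."""
    return [(heads[d], d) for d in range(1, len(heads))
            if not is_projective_arc(heads, d)]

def projectivize(heads):
    """Lift non-projective dependents to their grandparent until none remain.

    Returns the new head list and the set of tokens that were lifted.
    Arcs from the root are always projective, so a lifted token always
    has a grandparent to attach to.
    """
    heads = list(heads)
    lifted = set()
    while True:
        bad = [d for d in range(1, len(heads))
               if not is_projective_arc(heads, d)]
        if not bad:
            return heads, lifted
        # Lift the shortest offending arc first (an assumed heuristic).
        d = min(bad, key=lambda t: abs(heads[t] - t))
        heads[d] = heads[heads[d]]
        lifted.add(d)

# Hypothetical 8-token tree in which tokens 5 and 8 attach across other arcs.
example_heads = [-1, 2, 3, 0, 3, 2, 7, 5, 4]
projective_heads, lifted_tokens = projectivize(example_heads)
```

A projective parser could in principle be trained on trees produced by `projectivize`; recovering the original non-projective arcs from its output would additionally require recording, for example in the arc labels, which tokens were lifted.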

The second type of transformation concerns linguistic phenomena that are possible but hard for a parser to learn, given a certain choice of dependency analysis. This study concentrates on two such phenomena, coordination and verb groups, for which tree transformations are applied to improve parsing accuracy when the original structure does not coincide with a structure that is easy to learn.

Empirical evaluations are performed using treebank data from various languages and more than one dependency parser. The results show that the benefit of these tree transformations, applied as preprocessing and postprocessing steps, is to a large extent independent of language, treebank and parser.

Place, publisher, year, edition, pages
Växjö: Matematiska och systemtekniska institutionen, 2007, p. 84
Series
Reports from MSI, ISSN 1650-2647 ; 07002
Keywords [en]
Inductive Dependency Parsing, Dependency Structure, Tree Transformation, Non-projectivity, Coordination, Verb Group
National Category
Language Technology (Computational Linguistics)
Research subject
Computer and Information Sciences, Computer Science
Identifiers
URN: urn:nbn:se:vxu:diva-1206
OAI: oai:DiVA.org:vxu-1206
DiVA, id: diva2:204999
Presentation
2007-01-19, D1136, D-byggnaden, Växjö Universitet, Växjö, 13:15 (English)
Available from: 2007-03-21. Created: 2007-03-21. Last updated: 2018-01-13. Bibliographically approved.


Author
Nilsson, Jens
