Tree Transformations in Inductive Dependency Parsing
Växjö University, Faculty of Mathematics/Science/Technology, School of Mathematics and Systems Engineering (Matematiska och systemtekniska institutionen).
2007 (English). Licentiate thesis, monograph (Other academic)
Abstract [en]

This licentiate thesis deals with automatic syntactic analysis, or parsing, of natural languages. A data-driven parser learns to construct syntactic analyses from correctly analyzed sentences, known as training data. The general topic is how manipulating the training data can improve parsing accuracy.

Several studies of data-driven syntactic parsing with constituency-based representations have shown that training data annotated according to a linguistic theory often needs to be adapted in various ways to yield an adequate automatic analysis. A linguistically sound constituent structure is not necessarily well suited for learning and parsing with existing data-driven methods. Modifications to the constituency-based trees in the training data, with corresponding modifications to the parser output, have been applied successfully to increase parser accuracy. This thesis investigates whether similar modifications, in the form of tree transformations applied to training data annotated with dependency-based structures, can improve accuracy for data-driven dependency parsers. To this end, the thesis focuses on two types of tree transformations.

The first type concerns non-projectivity. The full potential of dependency parsing can only be realized if non-projective constructions are allowed, but such constructions pose a problem for projective dependency parsers. Non-projective parsers, on the other hand, tend to be slower, among other drawbacks. To retain the benefits of projective parsing, the thesis presents a tree transformation technique that recovers non-projectivity while using a projective parser.
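
To make the notion of non-projectivity concrete, the following minimal sketch uses the standard definition: an arc from a head to a dependent is projective if every token strictly between them is a descendant of the head. The abstract does not describe the transformation technique itself, so this only shows how non-projective arcs could be detected. The function names and the `heads` list encoding (index 0 as an artificial root) are assumptions made for illustration.

```python
from typing import List


def is_projective_arc(heads: List[int], dep: int) -> bool:
    """Check whether the arc heads[dep] -> dep is projective.

    heads[i] is the head of token i (tokens are numbered from 1);
    heads[0] is unused and 0 denotes an artificial root. The input is
    assumed to be a well-formed tree (no cycles).
    """
    head = heads[dep]
    lo, hi = min(head, dep), max(head, dep)
    for k in range(lo + 1, hi):
        # Climb from k toward the root; k must pass through `head`.
        node = k
        while node != 0 and node != head:
            node = heads[node]
        if node != head:
            return False
    return True


def non_projective_arcs(heads: List[int]) -> List[int]:
    """Return the dependents whose incoming arc is non-projective."""
    return [d for d in range(1, len(heads)) if not is_projective_arc(heads, d)]


# Toy tree over 8 tokens (hypothetical annotation, for illustration only).
toy_heads = [0, 2, 3, 0, 3, 2, 7, 5, 4]
print(non_projective_arcs(toy_heads))
```

In the toy tree, the arcs into tokens 5 and 8 cross other arcs, so a strictly projective parser could not produce this tree without some form of transformation.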

The second type of transformation concerns linguistic phenomena that are possible but hard for a parser to learn under a given choice of dependency analysis. The study concentrates on two such phenomena, coordination and verb groups, and applies tree transformations to improve parsing accuracy when the original structure is not one that is easy to learn.
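
As a rough illustration of what a reversible tree transformation for verb groups might look like, the sketch below switches between an analysis where the auxiliary depends on the main verb and one where the main verb depends on the auxiliary. The abstract does not specify the actual transformations, so everything here is an assumption: the labels `AuxV`, `Sb` and `Pred`, the marker label `VG:main`, the function names, and the restriction to at most one auxiliary per main verb.

```python
from typing import List, Tuple

AUX = "AuxV"      # hypothetical label for an auxiliary verb
MARK = "VG:main"  # hypothetical marker label used to undo the transformation

Tree = Tuple[List[int], List[str]]  # (heads, labels); index 0 is the root


def aux_as_head(heads: List[int], labels: List[str]) -> Tree:
    """Make each auxiliary the head of its main verb (preprocessing).

    Assumes at most one auxiliary per main verb.
    """
    new_heads, new_labels = heads[:], labels[:]
    for aux in range(1, len(heads)):
        if labels[aux] == AUX and heads[aux] != 0:
            main = heads[aux]
            # The auxiliary takes over the main verb's head and label.
            new_heads[aux], new_labels[aux] = heads[main], labels[main]
            # The main verb becomes a marked dependent of the auxiliary.
            new_heads[main], new_labels[main] = aux, MARK
    return new_heads, new_labels


def main_as_head(heads: List[int], labels: List[str]) -> Tree:
    """Inverse transformation (postprocessing of parser output)."""
    new_heads, new_labels = heads[:], labels[:]
    for main in range(1, len(heads)):
        if labels[main] == MARK:
            aux = heads[main]
            new_heads[main], new_labels[main] = heads[aux], labels[aux]
            new_heads[aux], new_labels[aux] = main, AUX
    return new_heads, new_labels


# "She has slept": tokens 1..3, main verb as head in the original analysis.
heads, labels = [0, 3, 3, 0], ["", "Sb", AUX, "Pred"]
t_heads, t_labels = aux_as_head(heads, labels)
assert main_as_head(t_heads, t_labels) == (heads, labels)
```

The marker label is what makes the round trip possible: the parser is trained on the transformed trees, and the inverse is applied to its output to restore the original style of analysis.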

Empirical evaluations are performed on treebank data from various languages, using more than one dependency parser. The results show that the benefit of these tree transformations, applied in preprocessing and postprocessing, is to a large extent independent of language, treebank, and parser.
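
The abstract does not name the accuracy metric. Dependency parsing accuracy is commonly reported as unlabeled and labeled attachment score (the share of tokens with the correct head, and with the correct head and label). The sketch below computes both under that assumption; the function name and the `(head, label)` token encoding are illustrative choices, not taken from the thesis.

```python
from typing import List, Tuple

Token = Tuple[int, str]  # (head index, dependency label)


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (unlabeled, labeled) attachment score over one token sequence."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    # Unlabeled: only the head has to match.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    # Labeled: both head and label have to match.
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, las


# Hypothetical three-token example: all heads correct, one label wrong.
gold = [(2, "Sb"), (0, "Pred"), (2, "Obj")]
pred = [(2, "Sb"), (0, "Pred"), (2, "Adv")]
uas, las = attachment_scores(gold, pred)
```

When transformations are used, the scores would be computed after the inverse transformation has been applied to the parser output, so that the comparison is made against the original gold-standard trees.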

Place, publisher, year, edition, pages
Växjö: Matematiska och systemtekniska institutionen, 2007, p. 84
Series
Rapporter från MSI, ISSN 1650-2647; 07002
Keywords [en]
Inductive Dependency Parsing, Dependency Structure, Tree Transformation, Non-projectivity, Coordination, Verb Group
National subject category
Language Technology (Computational Linguistics)
Research subject
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:vxu:diva-1206
OAI: oai:DiVA.org:vxu-1206
DiVA, id: diva2:204999
Presentation
2007-01-19, D1136, D-byggnaden, Växjö Universitet, Växjö, 13:15 (English)
Available from: 2007-03-21. Created: 2007-03-21. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

fulltext (798 kB), 413 downloads
File information
File name: FULLTEXT01.pdf
File size: 798 kB
Checksum MD5
8c59c83c4b754b2352f8499844e0dfabd1bf1cb0d4fce876c078467a3bc4d7fca7d906d9
Type: fulltext
MIME type: application/pdf

Person records (beta)

Nilsson, Jens
