Wolfgang Seeker - ACL Anthology

Wolfgang Seeker

2020

GRAIN-S: Manually Annotated Syntax for German Interviews
Agnieszka Falenska | Zoltán Czesznak | Kerstin Jung | Moritz Völkel | Wolfgang Seeker | Jonas Kuhn
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present GRAIN-S, a set of manually created syntactic annotations for radio interviews in German. The dataset extends the existing GRAIN corpus and comes with constituency and dependency trees for six interviews. The rare combination of gold- and silver-standard annotation layers coming from GRAIN with high-quality syntax trees can serve as a useful resource for speech- and text-based research. Moreover, since interviews fall between carefully prepared speech and spontaneous conversational speech, they cover phenomena not seen in traditional newspaper-based treebanks. Therefore, GRAIN-S can contribute to research into techniques for model adaptation and for building more corpus-independent tools. GRAIN-S follows TIGER, one of the established syntactic treebanks of German. We describe the annotation process and discuss decisions necessary to adapt the original TIGER guidelines to the interview domain. Next, we give details on the conversion from TIGER-style trees to dependency trees. We provide data statistics and demonstrate differences between the new dataset and existing out-of-domain test sets annotated with TIGER syntactic structures. Finally, we provide baseline parsing results for further comparison.

2016

How to Train Dependency Parsers with Inexact Search for Joint Sentence Boundary Detection and Parsing of Entire Documents
Anders Björkelund | Agnieszka Faleńska | Wolfgang Seeker | Jonas Kuhn
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Stacking or Supertagging for Dependency Parsing – What’s the Difference?
Agnieszka Faleńska | Anders Björkelund | Özlem Çetinoğlu | Wolfgang Seeker
Proceedings of the 14th International Conference on Parsing Technologies

A Graph-based Lattice Dependency Parser for Joint Morphological Segmentation and Syntactic Analysis
Wolfgang Seeker | Özlem Çetinoğlu
Transactions of the Association for Computational Linguistics, Volume 3

Space-delimited words in Turkish and Hebrew text can be further segmented into meaningful units, but syntactic and semantic context is necessary to predict segmentation. At the same time, predicting correct syntactic structures relies on correct segmentation. We present a graph-based lattice dependency parser that operates on morphological lattices to represent different segmentations and morphological analyses for a given input sentence. The lattice parser predicts a dependency tree over a path in the lattice and thus solves the joint task of segmentation, morphological analysis, and syntactic parsing. We conduct experiments on the Turkish and the Hebrew treebank and show that the joint model outperforms three state-of-the-art pipeline systems on both data sets. Our work corroborates findings from constituency lattice parsing for Hebrew and presents the first results for full lattice parsing on Turkish.

2014

Introducing the IMS-Wrocław-Szeged-CIS entry at the SPMRL 2014 Shared Task: Reranking and Morpho-syntax meet Unlabeled Data
Anders Björkelund | Özlem Çetinoğlu | Agnieszka Faleńska | Richárd Farkas | Thomas Mueller | Wolfgang Seeker | Zsolt Szántó
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages

An Out-of-Domain Test Suite for Dependency Parsing of German
Wolfgang Seeker | Jonas Kuhn
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present a dependency conversion of five German test sets from five different genres. The dependency representation is made as similar as possible to the dependency representation of TiGer, one of the two big syntactic treebanks of German. The purpose of these test sets is to enable researchers to test dependency parsing models on several different data sets from different text genres. We discuss some easy-to-compute statistics to demonstrate the variation and differences in the test sets and provide some baseline experiments where we test the effect of additional lexical knowledge on the out-of-domain performance of two state-of-the-art dependency parsers. Finally, we demonstrate with three small experiments that text normalization may be an important step in the standard processing pipeline when applied in an out-of-domain setting.

Visualization, Search, and Error Analysis for Coreference Annotations
Markus Gärtner | Anders Björkelund | Gregor Thiele | Wolfgang Seeker | Jonas Kuhn
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

A Graphical Interface for Automatic Error Mining in Corpora
Gregor Thiele | Wolfgang Seeker | Markus Gärtner | Anders Björkelund | Jonas Kuhn
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

2013

ICARUS – An Extensible Graphical Search Tool for Dependency Treebanks
Markus Gärtner | Gregor Thiele | Wolfgang Seeker | Anders Björkelund | Jonas Kuhn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations

The Effects of Syntactic Features in Automatic Prediction of Morphology
Wolfgang Seeker | Jonas Kuhn
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Morphological and Syntactic Case in Statistical Dependency Parsing
Wolfgang Seeker | Jonas Kuhn
Computational Linguistics, Volume 39, Issue 1 - March 2013

(Re)ranking Meets Morphosyntax: State-of-the-art Results from the SPMRL 2013 Shared Task
Anders Björkelund | Özlem Çetinoğlu | Richárd Farkas | Thomas Mueller | Wolfgang Seeker
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages

2012

German nach-Particle Verbs in Semantic Theory and Corpus Data
Boris Haselbach | Wolfgang Seeker | Kerstin Eckart
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper, we present a database-supported corpus study where we combine automatically obtained linguistic information from a statistical dependency parser, namely the occurrence of a dative argument, with predictions from a theory on the argument structure of German particle verbs with “nach”. The theory predicts five readings of “nach” which behave differently with respect to dative licensing in their argument structure. From a huge German web corpus, we extracted sentences for a subset of “nach”-particle verbs for which no dative is expected by the theory. Making use of a relational database management system, we bring together the corpus sentences and the lemmas manually annotated along the lines of the theory. We validate the theoretical predictions against the syntactic structure of the corpus sentences, which we obtained from a statistical dependency parser. We find that, in principle, the theory is borne out by the data; however, manual error analysis reveals cases for which the theory needs to be extended.

Approximating Theoretical Linguistics Classification in Real Data: the Case of German “nach” Particle Verbs
Boris Haselbach | Kerstin Eckart | Wolfgang Seeker | Kurt Eberle | Ulrich Heid
Proceedings of COLING 2012

Generating Non-Projective Word Order in Statistical Linearization
Bernd Bohnet | Anders Björkelund | Jonas Kuhn | Wolfgang Seeker | Sina Zarriess
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Data-driven Dependency Parsing With Empty Heads
Wolfgang Seeker | Richárd Farkas | Bernd Bohnet | Helmut Schmid | Jonas Kuhn
Proceedings of COLING 2012: Posters

Making Ellipses Explicit in Dependency Conversion for a German Treebank
Wolfgang Seeker | Jonas Kuhn
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a carefully designed dependency conversion of the German phrase-structure treebank TiGer that explicitly represents verb ellipses by introducing empty nodes into the tree. Although the conversion process uses heuristics, like many other conversion tools, we designed them to fail if no reasonable solution can be found. The failure of the conversion process makes it possible to detect elliptical constructions where the head is missing, but it also allows us to find errors in the original annotation. We discuss the conversion process and the heuristics, and describe some design decisions and error corrections that we applied to the corpus. Since most of today's data-driven dependency parsers are not able to handle empty nodes directly during parsing, our conversion tool also derives a canonical dependency format without empty nodes. It is shown experimentally to be well suited for training statistical dependency parsers by comparing the performance of two parsers from different parsing paradigms on the data set of the CoNLL 2009 Shared Task and on our corpus.

2011

On the Role of Explicit Morphological Feature Representation in Syntactic Dependency Parsing for German
Wolfgang Seeker | Jonas Kuhn
Proceedings of the 12th International Conference on Parsing Technologies

2010

Informed ways of improving data-driven dependency parsing for German
Wolfgang Seeker | Bernd Bohnet | Lilja Øvrelid | Jonas Kuhn
Coling 2010: Posters

Hard Constraints for Grammatical Function Labelling
Wolfgang Seeker | Ines Rehbein | Jonas Kuhn | Josef van Genabith
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics