2017
Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs
Richard Eckart de Castilho | Nancy Ide | Emanuele Lapponi | Stephan Oepen | Keith Suderman | Erik Velldal | Marc Verhagen
Proceedings of the 11th Linguistic Annotation Workshop
For decades, most self-respecting linguistic engineering initiatives have designed and implemented custom representations for various layers of, for example, morphological, syntactic, and semantic analysis. Despite occasional efforts at harmonization or even standardization, our field today is blessed with a multitude of ways of encoding and exchanging linguistic annotations of these types, both at the levels of ‘abstract syntax’, naming choices, and of course file formats. To a large degree, it is possible to work within and across design plurality by conversion, and often there may be good reasons for divergent design reflecting differences in use. However, it is likely that some abstract commonalities across choices of representation are obscured by more superficial differences, and conversely there is no obvious procedure to tease apart what actually constitute contentful vs. mere technical divergences. In this study, we seek to conceptually align three representations for common types of morpho-syntactic analysis, pinpoint what in our view constitute contentful differences, and reflect on the underlying principles and specific requirements that led to individual choices. We expect that a more in-depth understanding of these choices across designs may lead to increased harmonization, or at least to more informed design of future representations.
2014
Off-Road LAF: Encoding and Processing Annotations in NLP Workflows
Emanuele Lapponi | Erik Velldal | Stephan Oepen | Rune Lain Knudsen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The Linguistic Annotation Framework (LAF) provides an abstract data model for specifying interchange representations to ensure interoperability among different annotation formats. This paper describes an ongoing effort to adapt the LAF data model as the interchange representation in complex workflows as used in the Language Analysis Portal (LAP), an on-line and large-scale processing service that is developed as part of the Norwegian branch of the Common Language Resources and Technology Infrastructure (CLARIN) initiative. Unlike several related on-line processing environments, which predominantly instantiate a distributed architecture of web services, LAP achieves scalability to potentially very large data volumes through integration with the Norwegian national e-Infrastructure, and in particular job submission to a capacity compute cluster. This setup leads to tighter integration requirements and also calls for efficient, low-overhead communication of (intermediate) processing results with workflows. We meet these demands by coupling the LAF data model with a lean, non-redundant JSON-based interchange format and integration of an agile and performant NoSQL database, allowing parallel access from cluster nodes, as the central repository of linguistic annotation.
Predicting Party Affiliations from European Parliament Debates
Bjørn Høyland | Jean-François Godbout | Emanuele Lapponi | Erik Velldal
Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science
2013
Down-stream effects of tree-to-dependency conversions
Jakob Elming | Anders Johannsen | Sigrid Klerke | Emanuele Lapponi | Hector Martinez Alonso | Anders Søgaard
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
HPC-ready Language Analysis for Human Beings
Emanuele Lapponi | Erik Velldal | Nikolay A. Vazov | Stephan Oepen
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)
2012
UiO 2: Sequence-labeling Negation Using Dependency Features
Emanuele Lapponi | Erik Velldal | Lilja Øvrelid | Jonathon Read
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)