Stella Neumann


Linguistic profiles of translation manuscripts and edited translations
Tatiana Serbina | Mario Bisiada | Stella Neumann
Proceedings for the First Workshop on Modelling Translation: Translatology in the Digital Age


L2 Processing Advantages of Multiword Sequences: Evidence from Eye-Tracking
Elma Kerz | Arndt Heilmann | Stella Neumann
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)

A substantial body of research has demonstrated that native speakers are sensitive to the frequencies of multiword sequences (MWS). Here, we ask whether and to what extent intermediate-advanced L2 speakers of English can also develop sensitivity to the statistics of MWS. To this end, we aimed to replicate the MWS frequency effects found for adult native speakers, which were based on evidence from self-paced reading and sentence recall tasks, in an ecologically more valid eye-tracking study. L2 speakers’ sensitivity to MWS frequency was evaluated using generalized linear mixed-effects regression, with a separate model fitted for each of the four dependent measures. Mixed-effects modeling revealed significantly faster processing of sentences containing MWS compared to sentences containing equivalent control items across all eye-tracking measures. Taken together, these findings suggest that, in line with emergentist approaches, MWS are important building blocks of language and that similar mechanisms underlie both native and non-native language processing.


CoCoGen - Complexity Contour Generator: Automatic Assessment of Linguistic Complexity Using a Sliding-Window Technique
Marcus Ströbel | Elma Kerz | Daniel Wiechmann | Stella Neumann
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

We present a novel approach to the automatic assessment of text complexity based on a sliding-window technique that tracks the distribution of complexity within a text. This distribution is captured by what we term “complexity contours”, derived from a series of measurements for a given linguistic complexity measure. The approach is implemented in an automatic computational tool, CoCoGen – Complexity Contour Generator, which in its current version supports 32 indices of linguistic complexity. The goal of the paper is twofold: (1) to introduce the design of our computational tool based on a sliding-window technique and (2) to showcase this approach in the area of second language (L2) learning, more specifically in L2 writing.
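
As a rough illustration of the sliding-window idea described in this abstract, the following Python sketch computes a contour for a single measure. It is not CoCoGen's implementation: the function names, the sentence-level window, the one-sentence step size, and the toy measure (words per sentence) are all assumptions made for the example.

```python
from statistics import mean


def complexity_contour(sentences, measure, window_size=3):
    """Score each window of `window_size` consecutive sentences.

    The window moves forward one sentence at a time (an assumed step size),
    and each window's score is the mean of `measure` over its sentences.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    n_windows = max(len(sentences) - window_size + 1, 0)
    return [
        mean(measure(s) for s in sentences[i:i + window_size])
        for i in range(n_windows)
    ]


def words_per_sentence(sentence):
    """Toy complexity measure (hypothetical): whitespace-separated word count."""
    return len(sentence.split())


text = [
    "Short one.",
    "This sentence is a little bit longer.",
    "Tiny.",
    "Here the writer strings many more words together than before.",
]

contour = complexity_contour(text, words_per_sentence, window_size=2)
print(contour)
```

Plotting such a list against window position gives one contour per measure; a text shorter than the window yields an empty contour in this sketch.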

Dynamic pause assessment of keystroke logged data for the detection of complexity in translation and monolingual text production
Arndt Heilmann | Stella Neumann
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

Pause analysis of keystroke-logged translations is a hallmark of process-based translation studies. However, an exact definition of what constitutes a cognitively effortful pause during the translation process has not yet been found (Saldanha and O’Brien, 2013). This paper investigates the design of a keystroke- and subject-dependent system for identifying cognitive effort in order to track complexity in translation with keystroke logging (cf. also Dragsted, 2005; Couto-Vale, in preparation). It is an elastic measure that takes into account the idiosyncratic pause durations of translators as well as further confounds such as bigram frequency, letter frequency and some motor tasks involved in writing. The method is compared to a common static threshold of 1000 ms in an analysis of cognitive effort during the translation of grammatical functions from English to German. Additionally, the results are triangulated with eye-tracking data for further validation. The findings show that, at least for smaller sets of data, a dynamic pause assessment may lead to more accurate results than a generic static pause threshold of similar duration.

Automatic Recognition of Linguistic Replacements in Text Series Generated from Keystroke Logs
Daniel Couto-Vale | Stella Neumann | Paula Niemietz
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper introduces a toolkit used for the purpose of detecting replacements of different grammatical and semantic structures in ongoing text production logged as a chronological series of computer interaction events (so-called keystroke logs). The specific case we use involves human translations where replacements can be indicative of translator behaviour that leads to specific features of translations that distinguish them from non-translated texts. The toolkit uses a novel CCG chart parser customised so as to recognise grammatical words independently of space and punctuation boundaries. On the basis of the linguistic analysis, structures in different versions of the target text are compared and classified as potential equivalents of the same source text segment by ‘equivalence judges’. In that way, replacements of grammatical and semantic structures can be detected. Beyond the specific task at hand the approach will also be useful for the analysis of other types of spaceless text such as Twitter hashtags and texts in agglutinative or spaceless languages like Finnish or Chinese.


Part of Speech Annotation of Intermediate Versions in the Keystroke Logged Translation Corpus
Tatiana Serbina | Paula Niemietz | Matthias Fricke | Philipp Meisen | Stella Neumann
Proceedings of the 9th Linguistic Annotation Workshop


Multi-dimensional Annotation and Alignment in an English-German Translation Corpus
Silvia Hansen-Schirra | Stella Neumann | Mihaela Vela
Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing


The MULI Project: Annotation and Analysis of Information Structure in German and English
Stefan Baumann | Caren Brinckmann | Silvia Hansen-Schirra | Geert-Jan Kruijff | Ivana Kruijff-Korbayová | Stella Neumann | Erich Steiner | Elke Teich | Hans Uszkoreit
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Multi-dimensional annotation of linguistic corpora for investigating information structure
Stefan Baumann | Caren Brinckmann | Silvia Hansen-Schirra | Geert-Jan Kruijff | Ivana Kruijff-Korbayová | Stella Neumann | Elke Teich
Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004


Exploitation of an SFL-annotated multilingual register corpus
Stella Neumann
Proceedings of 4th International Workshop on Linguistically Interpreted Corpora (LINC-03) at EACL 2003