Detmar Meurers - ACL Anthology

Detmar Meurers

Also published as: Walt Detmar Meurers, W. Detmar Meurers

2025

Automatic concept extraction for learning domain modeling: A weakly supervised approach using contextualized word embeddings
Kordula De Kuthy | Leander Girrbach | Detmar Meurers
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

Heterogeneity in student populations poses a challenge in formal education, with adaptive textbooks offering a potential solution by tailoring content based on individual learner models. However, creating domain models for textbooks typically demands significant manual effort. Recent work by Chau et al. (2021) demonstrated automated concept extraction from digital textbooks, but relied on costly domain-specific manual annotations. This paper introduces a novel, scalable method that minimizes manual effort by combining contextualized word embeddings with weakly supervised machine learning. Our approach clusters word embeddings from textbooks and identifies domain-specific concepts using a machine learner trained on concept seeds automatically extracted from Wikipedia. We evaluate this method using 28 economics textbooks, comparing its performance against a tf-idf baseline, a supervised machine learning baseline, the RAKE keyword extraction method, and human domain experts. Results demonstrate that our weakly supervised method effectively balances accuracy with reduced annotation effort, offering a practical solution for automated concept extraction in adaptive learning environments.

Grammar Control in Dialogue Response Generation for Language Learning Chatbots
Dominik Glandorf | Peng Cui | Detmar Meurers | Mrinmaya Sachan
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Chatbots based on large language models offer cheap conversation practice opportunities for language learners. However, they are hard to control for linguistic forms that correspond to learners’ current needs, such as grammar. We control grammar in chatbot conversation practice by grounding a dialogue response generation model in a pedagogical repository of grammar skills. We also explore how this control helps learners to produce specific grammar. We comprehensively evaluate prompting, fine-tuning, and decoding strategies for grammar-controlled dialogue response generation. Strategically decoding Llama3 outperforms GPT-3.5 when tolerating minor response quality losses. Our simulation predicts grammar-controlled responses to support grammar acquisition adapted to learner proficiency. Existing language learning chatbots and research on second language acquisition benefit from these affordances. Code available on GitHub.

Interpretable Machine Learning for Societal Language Identification: Modeling English and German Influences on Portuguese Heritage Language
Soroosh Akef | Detmar Meurers | Amália Mendes | Patrick Rebuschat
Proceedings of the 14th Workshop on Natural Language Processing for Computer Assisted Language Learning

German Grammar Profile for Learners: Pedagogical Feature Definition and Automated Extraction
Denise Löfflad | Benedikt Beuttler | Detmar Meurers
Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Workshops

A Framework for Proficiency-Aligned Grammar Practice in LLM-Based Dialogue Systems
Luisa Ribeiro-Flucht | Xiaobin Chen | Detmar Meurers
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

Communicative practice is critical for second language development, yet learners often lack targeted, engaging opportunities to use new grammar structures. While large language models (LLMs) can offer coherent interactions, they are not inherently aligned with pedagogical goals or proficiency levels. In this paper, we explore how LLMs can be integrated into a structured framework for contextually-constrained, grammar-focused interaction, building on an existing goal-oriented dialogue system. Through controlled simulations, we evaluate five LLMs across 75 A2-level tasks under two conditions: (i) grammar-targeted, task-anchored prompting and (ii) the addition of a lightweight post-generation validation pipeline using a grammar annotator. Our findings show that template-based prompting alone substantially increases target-form coverage up to 91.4% for LLaMA 3.1-70B-Instruct, while reducing overly advanced grammar usage. The validation pipeline provides an additional boost in form-focused tasks, raising coverage to 96.3% without significantly degrading appropriateness.

2024

Investigating the Generalizability of Portuguese Readability Assessment Models Trained Using Linguistic Complexity Features
Soroosh Akef | Amália Mendes | Detmar Meurers | Patrick Rebuschat
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1

Towards Fine-Grained Pedagogical Control over English Grammar Complexity in Educational Text Generation
Dominik Glandorf | Detmar Meurers
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Teaching foreign languages and fostering language awareness in subject matter teaching requires a profound knowledge of grammar structures. Yet, while Large Language Models can act as tutors, it is unclear how effectively they can control grammar in generated text and adapt to learner needs. In this study, we investigate the ability of these models to exemplify pedagogically relevant grammar patterns, detect instances of grammar in a given text, and constrain text generation to grammar characteristic of a proficiency level. Concretely, we (1) evaluate the ability of GPT3.5 and GPT4 to generate example sentences for the standard English Grammar Profile CEFR taxonomy using few-shot in-context learning, (2) train BERT-based detectors with these generated examples of grammatical patterns, and (3) control the grammatical complexity of text generated by the open Mistral model by ranking sentence candidates with these detectors. We show that the grammar pattern instantiation quality is accurate but too homogeneous, and our classifiers successfully detect these patterns. A GPT-generated dataset of almost 1 million positive and negative examples for the English Grammar Profile is released with this work. With our method, Mistral’s output significantly increases the number of characteristic grammar constructions on the desired level, outperforming GPT4. This showcases how language domain knowledge can enhance Large Language Models for specific education needs, facilitating their effective use for intelligent tutor development and AI-generated materials. Code, models, and data are available at https://github.com/dominikglandorf/LLM-grammar.

Explainable AI in Language Learning: Linking Empirical Evidence and Theoretical Concepts in Proficiency and Readability Modeling of Portuguese
Luisa Ribeiro-Flucht | Xiaobin Chen | Detmar Meurers
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

While machine learning methods have supported significantly improved results in education research, a common deficiency lies in the explainability of the result. Explainable AI (XAI) aims to fill that gap by providing transparent, conceptually understandable explanations for the classification decisions, enhancing human comprehension and trust in the outcomes. This paper explores an XAI approach to proficiency and readability assessment employing a comprehensive set of 465 linguistic complexity measures. We identify theoretical descriptions associating such measures with varying levels of proficiency and readability and validate them using cross-corpus experiments employing supervised machine learning and Shapley Additive Explanations. The results not only highlight the utility of a diverse set of complexity measures in effectively modeling proficiency and readability in Portuguese, achieving a state-of-the-art accuracy of 0.70 in the proficiency classification task and of 0.84 in the readability classification task, but they largely corroborate the theoretical research assumptions, especially in the lexical domain.

2023

Reconciling Adaptivity and Task Orientation in the Student Dashboard of an Intelligent Language Tutoring System
Leona Colling | Tanja Heck | Detmar Meurers
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

In intelligent language tutoring systems, student dashboards should display the learning progress and performance and support the navigation through the learning content. Designing an interface that transparently offers information on students’ learning in relation to specific learning targets while linking to the overarching functional goal that motivates and organizes the practice in current foreign language teaching is challenging. This becomes even more difficult in systems that adaptively expose students to different learning material and individualize system interactions. If such a system is used in an ecologically valid setting of blended learning, this generates additional requirements to incorporate the needs of students and teachers for control and customizability. We present the conceptual design of a student dashboard for a task-based, user-adaptive intelligent language tutoring system intended for use in real-life English classes in secondary schools. We highlight the key challenges and spell out open questions for future research.

On the relevance and learner dependence of co-text complexity for exercise difficulty
Tanja Heck | Detmar Meurers
Proceedings of the 12th Workshop on NLP for Computer Assisted Language Learning

Using Learning Analytics for Adaptive Exercise Generation
Tanja Heck | Detmar Meurers
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

Single Choice exercises constitute a central exercise type for language learning in a learner’s progression from mere implicit exposure through input enhancement to productive language use in open exercises. Distractors that support learning in the individual zone of proximal development should not be derived from static analyses of learner corpora, but rely on dynamic learning analytics based on half-open exercises. We demonstrate how a system’s error diagnosis module can be re-used for automatic and dynamic generation and adaptation of distractors, as well as to inform exercise generation in terms of relevant learning goals and reasonable chunking in Jumbled Sentences exercises.

2022

Parametrizable exercise generation from authentic texts: Effectively targeting the language means on the curriculum
Tanja Heck | Detmar Meurers
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)

We present a parametrizable approach to exercise generation from authentic texts that addresses the need for digital materials designed to practice the language means on the curriculum in a real-life school setting. The tool builds on a language-aware search engine that helps identify attractive texts rich in the language means to be practiced. Making use of state-of-the-art NLP, the relevant learning targets are identified and transformed into exercise items embedded in the original context. The language-aware search engine ensures that these contexts match the learner’s interests based on the search term used, and the linguistic parametrization of the system then reranks the results to prioritize texts that richly represent the learning targets. For the exercise generation that proceeds on this basis, an interactive configuration panel allows users to adjust exercise complexity through a range of parameters specifying properties of both the source sentences and the exercises. An evaluation of exercises generated from web documents for a representative sample of language means selected from the English curriculum of 7th grade in German secondary school showed that the combination of language-aware search and exercise generation successfully facilitates the process of generating exercises from authentic texts that support practice of the pedagogical targets.

Assessing sentence readability for German language learners with broad linguistic modeling or readability formulas: When do linguistic insights make a difference?
Zarah Weiss | Detmar Meurers
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)

We present a new state-of-the-art sentence-wise readability assessment model for German L2 readers. We build a linguistically broadly informed machine learning model and compare its performance against four commonly used readability formulas. To understand when the linguistic insights used to inform our model make a difference for readability assessment and when simple readability formulas suffice, we compare their performance based on two common automatic readability assessment tasks: predictive regression and sentence pair ranking. We find that leveraging linguistic insights yields top performances across tasks, but that for the identification of simplified sentences also readability formulas – which are easier to compute and more accessible – can be sufficiently precise. Linguistically informed modeling, however, is the only viable option for high quality outcomes in fine-grained prediction tasks. We then explore the sentence-wise readability profile of leveled texts written for language learners at a beginning, intermediate, and advanced level of German to showcase the valuable insights that sentence-wise readability assessment can have for the adaptation of learning materials and better understand how sentences’ individual readability contributes to larger texts’ overall readability.

Generating and authoring high-variability exercises from authentic texts
Tanja Heck | Detmar Meurers
Proceedings of the 11th Workshop on NLP for Computer Assisted Language Learning

2021

Automatic annotation of curricular language targets to enrich activity models and support both pedagogy and adaptive systems
Martí Quixal | Björn Rudzewitz | Elizabeth Bear | Detmar Meurers
Proceedings of the 10th Workshop on NLP for Computer Assisted Language Learning

Employing distributional semantics to organize task-focused vocabulary learning
Haemanth Santhi Ponnusamy | Detmar Meurers
Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications

How can a learner systematically prepare for reading a book they are interested in? In this paper, we explore how computational linguistic methods such as distributional semantics, morphological clustering, and exercise generation can be combined with graph-based learner models to answer this question both conceptually and in practice. Based on highly structured learner models and concepts from network analysis, the learner is guided to efficiently explore the targeted lexical space. They practice using multi-gap learning activities generated from the book. In sum, the approach combines computational linguistic methods with concepts from network analysis and tutoring systems to support learners in pursuing their individual reading task goals.

Advancing Neural Question Generation for Formal Pragmatics: Learning when to generate and when to copy
Kordula De Kuthy | Madeeswaran Kannan | Haemanth Santhi Ponnusamy | Detmar Meurers
Proceedings of the First Workshop on Integrating Perspectives on Discourse Annotation

Using Broad Linguistic Complexity Modeling for Cross-Lingual Readability Assessment
Zarah Weiss | Xiaobin Chen | Detmar Meurers
Proceedings of the 10th Workshop on NLP for Computer Assisted Language Learning

Broad Linguistic Complexity Analysis for Greek Readability Classification
Savvas Chatzipanagiotidis | Maria Giagkou | Detmar Meurers
Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications

This paper explores the linguistic complexity of Greek textbooks as a readability classification task. We analyze textbook corpora for different school subjects and textbooks for Greek as a Second Language, covering a very wide spectrum of school age groups and proficiency levels. A broad range of quantifiable linguistic complexity features (lexical, morphological and syntactic) are extracted and calculated. Conducting experiments with different feature subsets, we show that the different linguistic dimensions contribute orthogonal information, each contributing towards the highest result achieved using all linguistic feature subsets. A readability classifier trained on this basis reaches a classification accuracy of 88.16% for the Greek as a Second Language corpus. To investigate the generalizability of the classification models, we also perform cross-corpus evaluations. We show that the model trained on the most varied text collection (for Greek as a school subject) generalizes best. In addition to advancing the state of the art for Greek readability analysis, the paper also contributes insights on the role of different feature sets and training setups for generalizable readability classification.

Exploring Input Representation Granularity for Generating Questions Satisfying Question-Answer Congruence
Madeeswaran Kannan | Haemanth Santhi Ponnusamy | Kordula De Kuthy | Lukas Stein | Detmar Meurers
Proceedings of the 14th International Conference on Natural Language Generation

In question generation, the question produced has to be well-formed and meaningfully related to the answer serving as input. Neural generation methods have predominantly leveraged the distributional semantics of words as representations of meaning and generated questions one word at a time. In this paper, we explore the viability of form-based and more fine-grained encodings, such as character or subword representations for question generation. We start from the typical seq2seq architecture using word embeddings presented by De Kuthy et al. (2020), who generate questions from text so that the answer given in the input text matches not just in meaning but also in form, satisfying question-answer congruence. We show that models trained on character and subword representations substantially outperform the published results based on word embeddings, and they do so with fewer parameters. Our approach eliminates two important problems of the word-based approach: the encoding of rare or out-of-vocabulary words and the incorrect replacement of words with semantically-related ones. The character-based model substantially improves on the published results, both in terms of BLEU scores and regarding the quality of the generated question. Going beyond the specific task, this result adds to the evidence weighing different form- and meaning-based representations for natural language processing tasks.

2020

Towards automatically generating Questions under Discussion to link information and discourse structure
Kordula De Kuthy | Madeeswaran Kannan | Haemanth Santhi Ponnusamy | Detmar Meurers
Proceedings of the 28th International Conference on Computational Linguistics

Questions under Discussion (QUD; Roberts, 2012) are emerging as a conceptually fruitful approach to spelling out the connection between the information structure of a sentence and the nature of the discourse in which the sentence can function. To make this approach useful for analyzing authentic data, Riester, Brunetti & De Kuthy (2018) presented a discourse annotation framework based on explicit pragmatic principles for determining a QUD for every assertion in a text. De Kuthy et al. (2018) demonstrate that this supports more reliable discourse structure annotation, and Ziai and Meurers (2018) show that based on explicit questions, automatic focus annotation becomes feasible. But both approaches are based on manually specified questions. In this paper, we present an automatic question generation approach to partially automate QUD annotation by generating all potentially relevant questions for a given sentence. While transformation rules can concisely capture the typical question formation process, a rule-based approach is not sufficiently robust for authentic data. We therefore employ the transformation rules to generate a large set of sentence-question-answer triples and train a neural question generation model on them to obtain both systematic question type coverage and robustness.

2019

The Impact of Spelling Correction and Task Context on Short Answer Assessment for Intelligent Tutoring Systems
Ramon Ziai | Florian Nuxoll | Kordula De Kuthy | Björn Rudzewitz | Detmar Meurers
Proceedings of the 8th Workshop on NLP for Computer Assisted Language Learning

Analyzing Linguistic Complexity and Accuracy in Academic Language Development of German across Elementary and Secondary School
Zarah Weiss | Detmar Meurers
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We track the development of writing complexity and accuracy in German students’ early academic language development from first to eighth grade. Combining an empirically broad approach to linguistic complexity with the high-quality error annotation included in the Karlsruhe Children’s Text corpus (Lavalley et al. 2015) used, we construct models of German academic language development that successfully identify the student’s grade level. We show that classifiers for the early years rely more on accuracy development, whereas development in secondary school is better characterized by increasingly complex language in all domains: linguistic system, language use, and human sentence processing characteristics. We demonstrate the generalizability and robustness of models using such a broad complexity feature set across writing topics.

Integrating large-scale web data and curated corpus data in a search engine supporting German literacy education
Sabrina Dittrich | Zarah Weiss | Hannes Schröter | Detmar Meurers
Proceedings of the 8th Workshop on NLP for Computer Assisted Language Learning

Computationally Modeling the Impact of Task-Appropriate Language Complexity and Accuracy on Human Grading of German Essays
Zarah Weiss | Anja Riemenschneider | Pauline Schröter | Detmar Meurers
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

Computational linguistic research on the language complexity of student writing typically involves human ratings as a gold standard. However, educational science shows that teachers find it difficult to identify and cleanly separate accuracy, different aspects of complexity, contents, and structure. In this paper, we therefore explore the use of computational linguistic methods to investigate how task-appropriate complexity and accuracy relate to the grading of overall performance, content performance, and language performance as assigned by teachers. Based on texts written by students for the official school-leaving state examination (Abitur), we show that teachers successfully assign higher language performance grades to essays with higher task-appropriate language complexity and properly separate this from content scores. Yet, accuracy impacts teacher assessment for all grading rubrics, also the content score, overemphasizing the role of accuracy. Our analysis is based on broad computational linguistic modeling of German language complexity and an innovative theory- and data-driven feature aggregation method inferring task-appropriate language complexity.

2018

Automatic Input Enrichment for Selecting Reading Material: An Online Study with English Teachers
Maria Chinkina | Ankita Oswal | Detmar Meurers
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

Input material at the appropriate level is crucial for language acquisition. Automating the search for such material can systematically and efficiently support teachers in their pedagogical practice. This is the goal of the computational linguistic task of automatic input enrichment (Chinkina & Meurers, 2016): It analyzes and re-ranks a collection of texts in order to prioritize those containing target linguistic forms. In the online study described in the paper, we collected 240 responses from English teachers in order to investigate whether they preferred automatic input enrichment over web search when selecting reading material for class. Participants demonstrated a general preference for the material provided by an automatic input enrichment system. It was also rated significantly higher than the texts retrieved by a standard web search engine with regard to the representation of linguistic forms and equivalent with regard to the relevance of the content to the topic. We discuss the implications of the results for language teaching and consider the potential strands of future research.

A Linguistically-Informed Search Engine to Identifiy Reading Material for Functional Illiteracy Classes
Zarah Weiss | Sabrina Dittrich | Detmar Meurers
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning

Generating Feedback for English Foreign Language Exercises
Björn Rudzewitz | Ramon Ziai | Kordula De Kuthy | Verena Möller | Florian Nuxoll | Detmar Meurers
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

While immediate feedback on learner language is often discussed in the Second Language Acquisition literature (e.g., Mackey 2006), few systems used in real-life educational settings provide helpful, metalinguistic feedback to learners. In this paper, we present a novel approach leveraging task information to generate the expected range of well-formed and ill-formed variability in learner answers along with the required diagnosis and feedback. We combine this offline generation approach with an online component that matches the actual student answers against the pre-computed hypotheses. The results obtained for a set of 33 thousand answers of 7th grade German high school students learning English show that the approach successfully covers frequent answer patterns. At the same time, paraphrases and content errors require a more flexible alignment approach, for which we are planning to complement the method with the CoMiC approach successfully used for the analysis of reading comprehension answers (Meurers et al., 2011).

Automatic Focus Annotation: Bringing Formal Pragmatics Alive in Analyzing the Information Structure of Authentic Data
Ramon Ziai | Detmar Meurers
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Analyzing language in context, both from a theoretical and from a computational perspective, is receiving increased interest. Complementing the research in linguistics on discourse and information structure, in computational linguistics identifying discourse concepts was also shown to improve the performance of certain applications, for example, Short Answer Assessment systems (Ziai and Meurers, 2014). Building on the research that established detailed annotation guidelines for manual annotation of information structural concepts for written (Dipper et al., 2007; Ziai and Meurers, 2014) and spoken language data (Calhoun et al., 2010), this paper presents the first approach automating the analysis of focus in authentic written data. Our classification approach combines a range of lexical, syntactic, and semantic features to achieve an accuracy of 78.1% for identifying focus.

Modeling the Readability of German Targeting Adults and Children: An empirically broad analysis and its cross-corpus validation
Zarah Weiß | Detmar Meurers
Proceedings of the 27th International Conference on Computational Linguistics

We analyze two novel data sets of German educational media texts targeting adults and children. The analysis is based on 400 automatically extracted measures of linguistic complexity from a wide range of linguistic domains. We show that both data sets exhibit broad linguistic adaptation to the target audience, which generalizes across both data sets. Our most successful binary classification model for German readability robustly shows high accuracy between 89.4%–98.9% for both data sets. To our knowledge, this comprehensive German readability model is the first for which robust cross-corpus performance has been shown. The research also contributes resources for German readability assessment that are externally validated as successful for different target audiences: we compiled a new corpus of German news broadcast subtitles, the Tagesschau/Logo corpus, and crawled a GEO/GEOlino corpus substantially enlarging the data compiled by Hancke et al. 2012.

Feedback Strategies for Form and Meaning in a Real-life Language Tutoring System
Ramon Ziai | Bjoern Rudzewitz | Kordula De Kuthy | Florian Nuxoll | Detmar Meurers
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning

COAST - Customizable Online Syllable Enhancement in Texts. A flexible framework for automatically enhancing reading materials
Heiko Holz | Zarah Weiss | Oliver Brehm | Detmar Meurers
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

This paper presents COAST, a web-based application to easily and automatically enhance syllable structure, word stress, and spacing in texts, which was designed in close collaboration with learning therapists to ensure its practical relevance. Such syllable-enhanced texts are commonly used in learning therapy or private tuition to promote the recognition of syllables in order to improve reading and writing skills. Compared with state-of-the-art solutions for automatic syllable enhancement, we put special emphasis on syllable stress and support specific marking of the primary syllable stress in words. Core features of our tool are i) a highly customizable text enhancement and template functionality, and ii) a novel crowd-sourcing mechanism that we employ to address the issue of data sparsity in language resources. We successfully tested COAST with real-life practitioners in a series of user tests validating the concept of our framework.

2017

Challenging learners in their individual zone of proximal development using pedagogic developmental benchmarks of syntactic complexity
Xiaobin Chen | Detmar Meurers
Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition

Question Generation for Language Learning: From ensuring texts are read to supporting learning
Maria Chinkina | Detmar Meurers
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

In Foreign Language Teaching and Learning (FLTL), questions are systematically used to assess the learner’s understanding of a text. Computational linguistic approaches have been developed to generate such questions automatically given a text (e.g., Heilman, 2011). In this paper, we want to broaden the perspective on the different functions questions can play in FLTL and discuss how automatic question generation can support the different uses. Complementing the focus on meaning and comprehension, we want to highlight the fact that questions can also be used to make learners notice form aspects of the linguistic system and their interpretation. Automatically generating questions that target linguistic forms and grammatical categories in a text in essence supports incidental focus-on-form (Loewen, 2005) in a meaning-focused reading task. We discuss two types of questions serving this purpose and how they can be generated automatically, and we report on a crowd-sourcing evaluation comparing automatically generated to manually written questions targeting particle verbs, a challenging linguistic form for learners of English.

Developing a web-based workbook for English supporting the interaction of students and teachers
Björn Rudzewitz | Ramon Ziai | Kordula De Kuthy | Detmar Meurers
Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition

2016

CTAP: A Web-Based Tool Supporting Automatic Complexity Analysis
Xiaobin Chen | Detmar Meurers
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

Informed by research on readability and language acquisition, computational linguists have developed sophisticated tools for the analysis of linguistic complexity. While some tools are starting to become accessible on the web, there still is a disconnect between the features that can in principle be identified based on state-of-the-art computational linguistic analysis, and the analyses a second language acquisition researcher, teacher, or textbook writer can readily obtain and visualize for their own collection of texts. This short paper presents a web-based tool development that aims to meet this challenge. The Common Text Analysis Platform (CTAP) is designed to support fully configurable linguistic feature extraction for a wide range of complexity analyses. It features a user-friendly interface, modularized and reusable analysis component integration, and flexible corpus and feature management. Building on the Unstructured Information Management framework (UIMA), CTAP readily supports integration of state-of-the-art NLP and complexity feature extraction maintaining modularization and reusability. CTAP thereby aims at providing a common platform for complexity analysis, encouraging research collaboration and sharing of feature extraction components—to jointly advance the state-of-the-art in complexity analysis in a form that readily supports real-life use by ordinary users.

Linguistically Aware Information Retrieval: Providing Input Enrichment for Second Language Learners
Maria Chinkina | Detmar Meurers
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Focus Annotation of Task-based Data: Establishing the Quality of Crowd Annotation
Kordula De Kuthy | Ramon Ziai | Detmar Meurers
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)

Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus
Kordula De Kuthy | Ramon Ziai | Detmar Meurers
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

While the formal pragmatic concepts in information structure, such as the focus of an utterance, are precisely defined in theoretical linguistics and potentially very useful in conceptual and practical terms, it has turned out to be difficult to reliably annotate such notions in corpus data. We present a large-scale focus annotation effort designed to overcome this problem. Our annotation study is based on the task-based corpus CREG, which consists of answers to explicitly given reading comprehension questions. We compare focus annotation by trained annotators with a crowd-sourcing setup making use of untrained native speakers. Given the task context and an annotation process incrementally making the question form and answer type explicit, the trained annotators reach substantial agreement for focus annotation. Interestingly, the crowd-sourcing setup also supports high-quality annotation for specific subtypes of data. Finally, we turn to the question of whether the relevance of focus annotation can be extrinsically evaluated. We show that automatic short-answer assessment significantly improves for focus-annotated data. The focus-annotated CREG corpus is freely available and constitutes the largest such resource for German.

Advancing Linguistic Features and Insights by Label-informed Feature Grouping: An Exploration in the Context of Native Language Identification
Serhiy Bykh | Detmar Meurers
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a hierarchical clustering approach designed to group linguistic features for supervised machine learning that is inspired by variationist linguistics. The method makes it possible to abstract away from the individual feature occurrences by grouping features together that behave alike with respect to the target class, thus providing a new, more general perspective on the data. On the one hand, it reduces data sparsity, leading to quantitative performance gains. On the other, it supports the formation and evaluation of hypotheses about individual choices of linguistic structures. We explore the method using features based on verb subcategorization information and evaluate the approach in the context of the Native Language Identification (NLI) task.

Online Information Retrieval for Language Learning
Maria Chinkina | Madeeswaran Kannan | Detmar Meurers
Proceedings of ACL-2016 System Demonstrations

Characterizing Text Difficulty with Word Frequencies
Xiaobin Chen | Detmar Meurers
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Approximating Givenness in Content Assessment through Distributional Semantics
Ramon Ziai | Kordula De Kuthy | Detmar Meurers
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics

Towards grounding computational linguistic approaches to readability: Modeling reader-text interaction for easy and difficult texts
Sowmya Vajjala | Detmar Meurers | Alexander Eitel | Katharina Scheiter
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

Computational approaches to readability assessment are generally built and evaluated using gold standard corpora labeled by publishers or teachers rather than being grounded in observations about human performance. Considering that both the reading process and the outcome can be observed, there is an empirical wealth that could be used to ground computational analysis of text readability. This will also support explicit readability models connecting text complexity and the reader’s language proficiency to the reading process and outcomes. This paper takes a step in this direction by reporting on an experiment to study how the relation between text complexity and the reader’s language proficiency affects the reading process and performance outcomes of readers after reading. We modeled the reading process using three eye tracking variables: fixation count, average fixation count, and second pass reading duration. Our models for these variables explained 78.9%, 74% and 67.4% of the variance, respectively. Performance outcome was modeled through recall and comprehension questions, and these models explained 58.9% and 27.6% of the variance, respectively. While the online models give us a better understanding of the cognitive correlates of reading with text complexity and language proficiency, modeling of the offline measures can be particularly relevant for incorporating user aspects into readability models.

2014

The MERLIN corpus: Learner language and the CEFR
Adriane Boyd | Jirka Hana | Lionel Nicolas | Detmar Meurers | Katrin Wisniewski | Andrea Abel | Karin Schöne | Barbora Štindlová | Chiara Vettori
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The MERLIN corpus is a written learner corpus for Czech, German, and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR) with authentic learner data. The corpus contains 2,290 learner texts produced in standardized language certifications covering CEFR levels A1-C1. The MERLIN annotation scheme includes a wide range of language characteristics that enable research into the empirical foundations of the CEFR scales and provide language teachers, test developers, and Second Language Acquisition researchers with concrete examples of learner performance and progress across multiple proficiency levels. For computational linguistics, it provides a range of authentic learner data for three target languages, supporting a broadening of the scope of research in areas such as automatic proficiency classification or native language identification. The annotated corpus and related information will be freely available as a corpus resource and through a freely accessible, didactically-oriented online platform.

Assessing the relative reading level of sentence pairs for text simplification
Sowmya Vajjala | Detmar Meurers
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Exploring Syntactic Features for Native Language Identification: A Variationist Perspective on Feature Encoding and Ensemble Optimization
Serhiy Bykh | Detmar Meurers
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

CLARA: A New Generation of Researchers in Common Language Resources and Their Applications
Koenraad De Smedt | Erhard Hinrichs | Detmar Meurers | Inguna Skadiņa | Bolette Pedersen | Costanza Navarretta | Núria Bel | Krister Lindén | Markéta Lopatková | Jan Hajič | Gisle Andersen | Przemyslaw Lenkiewicz
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

CLARA (Common Language Resources and Their Applications) is a Marie Curie Initial Training Network which ran from 2009 until 2014 with the aim of providing researcher training in crucial areas related to language resources and infrastructure. The scope of the project was broad and included infrastructure design, lexical semantic modeling, domain modeling, multimedia and multimodal communication, applications, and parsing technologies and grammar models. An international consortium of 9 partners and 12 associate partners employed researchers in 19 new positions and organized a training program consisting of 10 thematic courses and summer/winter schools. The project has resulted in new theoretical insights as well as new resources and tools. Most importantly, the project has trained a new generation of researchers who can perform advanced research and development in language resources and technologies.

A VIEW of Russian: Visual Input Enhancement and Adaptive Feedback
Robert Reynolds | Eduard Schaf | Detmar Meurers
Proceedings of the third workshop on NLP for computer-assisted language learning

Exploring Measures of “Readability” for Spoken Language: Analyzing linguistic features of subtitles to identify age-specific TV programs
Sowmya Vajjala | Detmar Meurers
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR)

Focus Annotation in Reading Comprehension Data
Ramon Ziai | Detmar Meurers
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop

2013

Combining Shallow and Linguistically Motivated Features in Native Language Identification
Serhiy Bykh | Sowmya Vajjala | Julia Krivanek | Detmar Meurers
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

CoMeT: Integrating different levels of linguistic modeling for meaning assessment
Niels Ott | Ramon Ziai | Michael Hahn | Detmar Meurers
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

On The Applicability of Readability Models to Web Texts
Sowmya Vajjala | Detmar Meurers
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations

2012

Readability Classification for German using Lexical, Syntactic, and Morphological Features
Julia Hancke | Sowmya Vajjala | Detmar Meurers
Proceedings of COLING 2012

Native Language Identification using Recurring n-grams – Investigating Abstraction and Domain Dependence
Serhiy Bykh | Detmar Meurers
Proceedings of COLING 2012

Short Answer Assessment: Establishing Links Between Research Strands
Ramon Ziai | Niels Ott | Detmar Meurers
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

Informing Determiner and Preposition Error Correction with Hierarchical Word Clustering
Adriane Boyd | Marion Zepf | Detmar Meurers
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

Evaluating the Meaning of Answers to Reading Comprehension Questions: A Semantics-Based Approach
Michael Hahn | Detmar Meurers
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition
Sowmya Vajjala | Detmar Meurers
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

2011

Automatic Sentiment Classification of Product Reviews Using Maximal Phrases Based Analysis
Maria Tchalakova | Dale Gerdemann | Detmar Meurers
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

Data-Driven Correction of Function Words in Non-Native English
Adriane Boyd | Detmar Meurers
Proceedings of the 13th European Workshop on Natural Language Generation

Evaluating Answers to Reading Comprehension Questions in Context: Results for German and the Role of Information Structure
Detmar Meurers | Ramon Ziai | Niels Ott | Janina Kopp
Proceedings of the TextInfer 2011 Workshop on Textual Entailment

2010

Emotional Perception of Fairy Tales: Achieving Agreement in Emotion Annotation of Text
Ekaterina P. Volkova | Betty Mohler | Detmar Meurers | Dale Gerdemann | Heinrich H. Bülthoff
Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text

Enhancing Authentic Web Pages for Language Learners
Detmar Meurers | Ramon Ziai | Luiz Amaral | Adriane Boyd | Aleksandar Dimitrov | Vanessa Metcalf | Niels Ott
Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications

Exploring the Data-Driven Prediction of Prepositions in English
Anas Elghafari | Detmar Meurers | Holger Wunsch
Coling 2010: Posters

2008

Diagnosing Meaning Errors in Short Answers to Reading Comprehension Questions
Stacey Bailey | Detmar Meurers
Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications

Revisiting the Impact of Different Annotation Schemes on PCFG Parsing: A Grammatical Dependency Evaluation
Adriane Boyd | Detmar Meurers
Proceedings of the Workshop on Parsing German

2005

“Language and Computers”: Creating an Introduction for a General Undergraduate Audience
Chris Brew | Markus Dickinson | W. Detmar Meurers
Proceedings of the Second ACL Workshop on Effective Tools and Methodologies for Teaching NLP and CL

Detecting Errors in Discontinuous Structural Annotation
Markus Dickinson | W. Detmar Meurers
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

A Grammar Formalism and Parser for Linearization-based HPSG
Michael W. Daniels | W. Detmar Meurers
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

Detecting Errors in Part-of-Speech Annotation
Markus Dickinson | W. Detmar Meurers
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

A Web-based Instructional Platform for Constraint-Based Grammar Formalisms and Parsing
W. Detmar Meurers | Gerald Penn | Frank Richter
Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics

1997

A Computational Treatment of Lexical Rules in HPSG as Covariation in Lexical Entries
W. Detmar Meurers | Guido Minnen
Computational Linguistics, Volume 23, Number 4, December 1997

Interleaving Universal Principles and Relational Constraints over Typed Feature Logic
Thilo Gotz | Detmar Meurers
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

The ConTroll System as Large Grammar Development Platform
Thilo Gotz | Walt Detmar Meurers
Computational Environments for Grammar Development and Linguistic Engineering

1995

Compiling HPSG type constraints into definite clause programs
Thilo Gotz | Walt Detmar Meurers
33rd Annual Meeting of the Association for Computational Linguistics

Co-authors

Björn Rudzewitz 5

Maria Chinkina 4

Madeeswaran Kannan 4

Haemanth Santhi Ponnusamy 4

Markus Dickinson 3

Florian Nuxoll 3

Sabrina Dittrich 2

Dale Gerdemann 2

Dominik Glandorf 2

Amália Mendes 2

Patrick Rebuschat 2

Luisa Ribeiro-Flucht 2

Gisle Andersen 1

Stacey Bailey 1

Elizabeth Bear 1

Benedikt Beuttler 1

Heinrich H. Bülthoff 1

Savvas Chatzipanagiotidis 1

Leona Colling 1

Michael W. Daniels 1

Koenraad De Smedt 1

Aleksandar Dimitrov 1

Alexander Eitel 1

Anas Elghafari 1

Maria Giagkou 1

Leander Girrbach 1

Erhard Hinrichs 1

Julia Krivanek 1

Przemyslaw Lenkiewicz 1

Krister Lindén 1

Markéta Lopatková 1

Denise Löfflad 1

Vanessa Metcalf 1

Verena Möller 1

Costanza Navarretta 1

Lionel Nicolas 1

Bolette Sandford Pedersen 1

Martí Quixal 1

Robert Reynolds 1

Frank Richter 1

Anja Riemenschneider 1

Mrinmaya Sachan 1

Katharina Scheiter 1

Hannes Schröter 1

Pauline Schröter 1

Karin Schöne 1

Inguna Skadiņa 1

Maria Tchalakova 1

Chiara Vettori 1

Ekaterina P. Volkova 1

Katrin Wisniewski 1

Holger Wunsch 1

Barbora Štindlová 1

Venues