This paper presents a fine-grained test suite for the language pair German–English. The test suite is based on a number of linguistically motivated categories and phenomena, and the semi-automatic evaluation is carried out with regular expressions. We describe the creation and implementation of the test suite in detail, providing a full list of all categories and phenomena. Furthermore, we present various exemplary applications of our test suite that have been implemented in past years, such as contributions to the Conference on Machine Translation, the use of the test suite and MT outputs for quality estimation, and the extension of the test suite to the language pair Portuguese–English. We describe how we tracked the development of the performance of various MT systems over the years with the help of the test suite, and which categories and phenomena are prone to producing MT errors. For the first time, we also make a large part of our test suite publicly available to the research community.
We present the results of the application of a grammatical test suite for German-to-English MT to the systems submitted at WMT19, with a detailed analysis of 107 phenomena organized into 14 categories. On average, the systems still translate one out of four test items incorrectly. Low performance is observed for idioms, modals, pseudo-clefts, multi-word expressions and verb valency. Compared to last year, there has been an improvement for function words, non-verbal agreement and punctuation. More detailed conclusions about particular systems and phenomena are also presented.
We present an analysis of 16 state-of-the-art MT systems on German-English based on a linguistically-motivated test suite. The test suite has been devised manually by a team of language professionals in order to cover a broad variety of linguistic phenomena that MT often fails to translate properly. It contains 5,000 test sentences covering 106 linguistic phenomena in 14 categories, with an increased focus on verb tenses, aspects and moods. The MT outputs are evaluated in a semi-automatic way through regular expressions that focus only on the part of the sentence that is relevant to each phenomenon. Through our analysis, we are able to compare systems based on their performance on these categories. Additionally, we reveal strengths and weaknesses of particular systems and we identify grammatical phenomena where the overall performance of MT is relatively low.
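As a rough illustration of this regex-based checking scheme, a minimal sketch in Python could look as follows. The test item, the patterns and the example sentences below are invented for illustration and are not part of the actual test suite:

```python
import re

# Minimal sketch of the semi-automatic evaluation: each test item pairs
# a source sentence with regular expressions that match correct (or
# known-incorrect) renderings of the phenomenon under test.
# All data below is an illustrative placeholder, not actual suite content.
test_item = {
    "source": "Sie hätte das Buch lesen sollen.",
    "phenomenon": "modal construction",
    "pass_patterns": [r"\bshould have read\b"],
    "fail_patterns": [r"\bshould read\b", r"\bhad to read\b"],
}

def judge(mt_output: str, item: dict) -> str:
    """Return 'pass', 'fail', or 'manual' (needs human review)."""
    if any(re.search(p, mt_output) for p in item["pass_patterns"]):
        return "pass"
    if any(re.search(p, mt_output) for p in item["fail_patterns"]):
        return "fail"
    return "manual"  # unmatched outputs go to a human annotator

print(judge("She should have read the book.", test_item))  # pass
print(judge("She should read the book.", test_item))       # fail
```

Only the part of the output relevant to the phenomenon is matched, so surface variation elsewhere in the sentence does not affect the verdict; outputs matched by neither pattern set fall back to manual inspection.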
The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.
Relation extraction is the task of recognizing and extracting relations between entities or concepts in texts. A common approach is to exploit existing knowledge to learn linguistic patterns expressing the target relation and to use these patterns for extracting new relation mentions. Deriving relation patterns automatically usually results in large numbers of candidates, which need to be filtered to derive a subset of patterns that reliably extract correct relation mentions. We address the pattern selection task by exploiting the knowledge represented by entailment graphs, which capture semantic relationships holding among the learned pattern candidates. This is motivated by the fact that a pattern may not express the target relation explicitly, but still be useful for extracting instances for which the relation holds, because its meaning entails the meaning of the target relation. We evaluate the use of both automatically generated and gold-standard entailment graphs in a relation extraction scenario and present favorable experimental results, exhibiting the benefits of structuring and selecting patterns based on entailment graphs.
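To illustrate the core idea of entailment-based pattern selection, here is a toy sketch: a candidate pattern is kept if it transitively entails the target relation in the graph. The patterns and the graph below are invented for illustration and are not taken from the paper:

```python
from collections import defaultdict

# Directed edges: pattern A entails pattern B (toy graph, invented data).
entails = defaultdict(set)
entails["X married Y"].add("X is spouse of Y")            # explicit
entails["X exchanged vows with Y"].add("X married Y")     # entails target
entails["X dated Y"].add("X knows Y")                     # does not

def entails_target(pattern, target, seen=None):
    """Keep a candidate pattern iff it (transitively) entails the target."""
    if pattern == target:
        return True
    seen = seen or set()
    seen.add(pattern)
    return any(entails_target(n, target, seen)
               for n in entails[pattern] if n not in seen)

target = "X is spouse of Y"
candidates = ["X married Y", "X exchanged vows with Y", "X dated Y"]
selected = [p for p in candidates if entails_target(p, target)]
print(selected)  # ['X married Y', 'X exchanged vows with Y']
```

Note how "X exchanged vows with Y" survives the filter even though it never mentions the spouse relation explicitly; that is exactly the case the entailment structure is meant to capture.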
Web debates play an important role in enabling broad participation of constituencies in social, political and economic decision-making. However, it is challenging to organize, structure, and navigate a vast number of diverse arguments and comments collected from many participants over a long period of time. In this paper we demonstrate Common Round, a next-generation platform for large-scale web debates, which provides functions for eliciting the semantic content and structure of participants' contributions. In particular, Common Round applies language technologies to extract the semantic essence from textual input and to aggregate the formulated opinions and arguments. The platform also provides cross-lingual access to debates using machine translation.
Radiology reports express the results of a radiology study and contain information about anatomical entities, findings, measurements and the medical doctor's impressions. Information extraction techniques can help physicians access this information in order to understand the data and to infer further knowledge. Supervised machine learning methods are very popular for information extraction, but are usually domain- and language-dependent. To train new classification models, annotated data is required; annotated data is also required as an evaluation resource for information extraction algorithms. However, one major drawback of processing clinical data is the low availability of annotated datasets. For this reason, we performed a manual annotation of radiology reports written in Spanish. This paper presents the corpus, the annotation schema, the annotation guidelines and further insights into the data.
A common reason for errors in coreference resolution is the lack of semantic information to help determine the compatibility between mentions referring to the same entity. Distributed representations, which have been shown successful in encoding relatedness between words, could potentially be a good source of such knowledge. Moreover, being obtained in an unsupervised manner, they could help address data sparsity issues in labeled training data at a small cost. In this work we investigate whether and to what extent features derived from word embeddings can be successfully used for supervised coreference resolution. We experiment with several word embedding models, and several different types of embedding-based features, including embedding cluster and cosine similarity-based features. Our evaluations show improvements in the performance of a supervised state-of-the-art coreference system.
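A minimal sketch of one such embedding-based feature family, the cosine similarity between mention head words, is given below. The vectors are random stand-ins for real word embeddings, and the feature names are invented for illustration:

```python
import numpy as np

# Random stand-ins for trained word embeddings (illustration only).
rng = np.random.default_rng(0)
embeddings = {w: rng.standard_normal(50)
              for w in ("president", "leader", "table")}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mention_pair_features(head1: str, head2: str) -> dict:
    """Features for a candidate coreferent mention pair."""
    sim = cosine(embeddings[head1], embeddings[head2])
    return {
        "head_cosine": sim,
        # A binned version is often more robust for linear classifiers.
        "head_cosine_bin": round(sim, 1),
    }

print(mention_pair_features("president", "leader"))
```

Such real-valued or binned similarity features are then added to the feature set of the supervised resolver alongside its conventional string-match and syntactic features.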
A huge body of continuously growing written knowledge is available on the web in the form of social media posts, RSS feeds, and news articles. Real-time information extraction from such high velocity, high volume text streams requires scalable, distributed natural language processing pipelines. We introduce such a system for fine-grained event recognition within the big data framework Flink, and demonstrate its capabilities for extracting and geo-locating mobility- and industry-related events from heterogeneous text sources. Performance analyses conducted on several large datasets show that our system achieves high throughput and maintains low latency, which is crucial when events need to be detected and acted upon in real-time. We also present promising experimental results for the event extraction component of our system, which recognizes a novel set of event types. The demo system is available at http://dfki.de/sd4m-sta-demo/.
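The following plain-Python sketch mimics a single map stage of such a streaming pipeline; it is not the actual Flink job, and all event patterns, place names and record structures are invented for illustration:

```python
import re

# Toy gazetteer and event triggers (invented; the real system uses a
# much richer event typology and geo-location component).
GEO = {"Berlin": (52.52, 13.405), "Hamburg": (53.55, 9.993)}
EVENT_PATTERNS = {
    "traffic_jam": re.compile(r"\b(stau|traffic jam)\b", re.I),
    "strike": re.compile(r"\b(streik|strike)\b", re.I),
}

def extract_events(post: str):
    """Map one incoming text to zero or more geo-located event records."""
    for etype, pattern in EVENT_PATTERNS.items():
        if pattern.search(post):
            for place, coords in GEO.items():
                if place.lower() in post.lower():
                    yield {"type": etype, "place": place, "coords": coords}

stream = ["Traffic jam on A10 near Berlin", "Streik in Hamburg heute"]
for post in stream:
    print(list(extract_events(post)))
```

In the deployed system this per-document extraction runs as a parallel operator inside Flink, which is what allows the pipeline to keep latency low at high input volumes.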
Recent research shows the importance of linking linguistic knowledge resources for the creation of large-scale linguistic data. We describe our approach for combining two English resources, FrameNet and sar-graphs, and illustrate the benefits of the linked data in a relation extraction setting. While FrameNet consists of schematic representations of situations, linked to lexemes and their valency patterns, sar-graphs are knowledge resources that connect semantic relations from factual knowledge graphs to the linguistic phrases used to express instances of these relations. We analyze the conceptual similarities and differences of both resources and propose to link sar-graphs and FrameNet on the levels of relations/frames as well as phrases. The former alignment involves a manual ontology mapping step, which allows us to extend sar-graphs with new phrase patterns from FrameNet. The phrase-level linking, on the other hand, is fully automatic. We investigate the quality of the automatically constructed links and identify two main classes of errors.
The task of relation extraction is to recognize and extract relations between entities or concepts in texts. Dependency parse trees have become a popular source for discovering extraction patterns, which encode the grammatical relations among the phrases that jointly express relation instances. State-of-the-art weakly supervised approaches to relation extraction typically extract thousands of unique patterns that only potentially express the target relation. Among these patterns, some are semantically equivalent, but differ in their morphological, lexical-semantic or syntactic form. Some express a relation that entails the target relation. We propose a new approach to structuring extraction patterns by utilizing entailment graphs, hierarchical structures representing entailment relations, and present a novel resource of gold-standard entailment graphs based on a set of patterns automatically acquired using distant supervision. We describe the methodology used for creating the dataset and present statistics of the resource as well as an analysis of the inference types underlying the entailment decisions.
In this work we present a fine-grained annotation schema for detecting named entities in German clinical data of chronically ill patients with kidney diseases. The annotation schema is driven by the needs of our clinical partners and by the linguistic characteristics of the German language. In order to generate annotations within a short period, we also present a semi-automatic annotation approach that uses additional knowledge sources, such as UMLS, to pre-annotate concepts. The presented schema will be used to apply novel techniques from natural language processing and machine learning that support doctors in treating their patients through improved access to information in unstructured German texts.
An important subtask in clinical text mining is to identify whether a clinical finding is expressed as present, absent or unsure in a text. This work presents a system for detecting mentions of clinical findings that are negated or merely speculated. The system has been applied to two different types of German clinical texts: clinical notes and discharge summaries. Our approach builds on NegEx, a well-known algorithm for identifying non-factive mentions of medical findings. In this work, we adjust a previous adaptation of NegEx to German and evaluate the system on our data for detecting negation and speculation. The results are compared to a baseline algorithm and analyzed for both types of clinical documents. Our system achieves an F1-score above 0.9 on both types of reports.
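A much-simplified, NegEx-style check is sketched below; the trigger lists, window size and substring matching are illustrative simplifications of the adapted algorithm, whose trigger lexicon and scope rules are considerably richer:

```python
# Toy trigger lists (the real German adaptation uses a larger lexicon
# and distinguishes pre- and post-trigger scopes).
NEG_TRIGGERS = ["kein", "keine", "ohne", "nicht"]               # negation
SPEC_TRIGGERS = ["verdacht auf", "möglicherweise", "fraglich"]  # speculation
WINDOW = 5  # max tokens between trigger and finding

def classify_finding(tokens, finding_idx):
    """Label a finding mention as 'negated', 'speculated', or 'affirmed'."""
    lo = max(0, finding_idx - WINDOW)
    context = " ".join(tokens[lo:finding_idx]).lower()
    if any(t in context for t in NEG_TRIGGERS):
        return "negated"
    if any(t in context for t in SPEC_TRIGGERS):
        return "speculated"
    return "affirmed"

tokens = "kein Hinweis auf Pneumonie".split()
print(classify_finding(tokens, tokens.index("Pneumonie")))  # negated
```

The key design idea, inherited from NegEx, is that a small window-bounded trigger lookup already covers most negation and speculation contexts in clinical prose.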
In this paper, we report on first attempts at analyzing German patient records, and on our findings, using a hybrid parsing architecture and a combination of two relation extraction strategies. On a practical level, we are interested in the extraction of concepts and of relations among those concepts, a necessary cornerstone for building medical information systems. The parsing pipeline consists of a morphological analyzer, a robust chunk parser adapted to the Latin phrases used in medical diagnoses, a repair rule stage, and a probabilistic context-free parser that respects the output from the chunker. The relation extraction stage is a combination of two systems: SProUT, a shallow processor which uses hand-written rules to discover relation instances in local text units, and DARE, which extracts relation instances from complete sentences using rules that are learned in a bootstrapping process starting from semantic seeds. Two small experiments have been carried out for the parsing pipeline and the relation extraction stage.
This paper presents a new resource for training and evaluation in relation extraction experiments. The corpus consists of annotations of mentions of three semantic relations: marriage, parent–child and siblings, selected from the domain of biographic facts about persons and their social relationships. The corpus contains more than one hundred news articles from the tabloid press. In the current corpus, we only consider relation mentions occurring within individual sentences. We provide multi-level annotations which specify the marked facts from the relation, argument and entity level down to the token level, thus allowing for detailed analysis of linguistic phenomena and their interactions. Recon, a generic markup tool developed at the DFKI LT Lab, was used for the annotation task. The corpus has been annotated by two human experts, supported by additional conflict resolution conducted by a third expert. As shown in the evaluation, the annotation is of high quality, as demonstrated by the inter-annotator agreement both on the sentence level and on the relation-mention level. The corpus is already in active use in our research for evaluating the relation extraction performance of our automatically learned extraction patterns.
Human translators are the key to evaluating machine translation (MT) quality and also to addressing the so far unanswered question of when and how to use MT in professional translation workflows. This paper describes the corpus developed as the result of a detailed, large-scale human evaluation consisting of three tightly connected tasks: ranking, error classification and post-editing.
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international levels, mainly with regard to politics and the funding situation for LT topics. This paper documents the initiative's work throughout Europe to boost progress and innovation in our field.
Modern language learning courses are no longer exclusively based on books or face-to-face lectures. More and more lessons make use of multimedia and personalized learning methods. Many of these are based on e-learning solutions. Learning via the Internet provides 24/7 services that require sizeable human resources. We therefore witness growing economic pressure to employ computer-assisted methods to improve the quality, efficiency and scalability of language learning. In this paper, we address three applications of language technologies for language learning: 1) methods and strategies for pronunciation training in second language learning, e.g., multimodal feedback via visualization of sound features, speech verification and prosody transplantation; 2) dialogue-based language learning games; 3) the application of parsing and generation technologies to the automatic generation of paraphrases for the semi-automatic production of learning material.
In this paper, we present a novel combination of two types of language resources dedicated to the detection of relevant relations (RE), such as events or facts, across sentence boundaries. One of the two resources is the sar-graph, which aggregates, for each target relation, tens of thousands of linguistic patterns of semantically associated relations that signal instances of the target relation (Uszkoreit and Xu, 2013). These have been learned from the Web by intra-sentence pattern extraction (Krause et al., 2012) and, after semantic filtering and enrichment, have been automatically combined into a single graph. The other resource is cockrACE, a specially annotated corpus for the training and evaluation of cross-sentence RE. Using our powerful annotation tool Recon, annotators mark selected entities and relations (including events), coreference relations among these entities and events, and also terms that are semantically related to the relevant relations and events. This paper describes how the two resources are created and how they complement each other.
The paper describes the evaluation of the KomParse system. KomParse is a dialogue system embedded in a 3-D massively multiplayer online game, allowing conversations between non-player characters (NPCs) and game users. In a field test with game users, the system was evaluated with respect to the acceptability and usability of the overall system as well as task completion, dialogue control and efficiency for three conversational tasks. Furthermore, subjective feedback was collected to evaluate the individual communication components of the system, such as natural language understanding. The results are very satisfying and promising. In general, both the usability and acceptability tests show that the tested NPC is useful and well accepted by the users. Even when the NPC does not understand the users well or says unexpected things, it can still provide appropriate responses that help users solve their problems or entertain them.
After several years of development, the vision of the Semantic Web is gradually becoming reality. Large data repositories have been created and offer semantic information in a machine-processable form for various domains. Semantic Web data can be published on the Web, gathered automatically, and reasoned about. All these developments open interesting perspectives for building a new class of domain-specific, broad-coverage information systems that overcome a long-standing bottleneck of AI systems, the notoriously incomplete knowledge base. We present a system that shows how the wealth of information in the Semantic Web can be interfaced with humans once again, using natural language for querying and answering rather than technical formalisms. Whereas current Question Answering systems typically select snippets from Web documents retrieved by a search engine, we utilize Semantic Web data, which allows us to provide natural-language answers that are tailored to the current dialog context. Furthermore, we show how to use natural language processing technologies to acquire new data and enrich existing data in a Semantic Web framework. Our system has acquired a rich biographic data resource by combining existing Semantic Web resources, which are discovered from semi-structured textual data in Web pages, with information extracted from free natural language texts.
LT World (www.lt-world.org) is an ontology-driven web portal aimed at serving the global language technology community. Ontology-driven means that the system is driven by an ontological schema to manage the research information and knowledge life-cycles: identify relevant concepts of information; structure and formalize them; assign relationships, functions and views; add states and rules; and modify them. For modelling such a complex structure, we employ (i) concepts from the research domain, such as person, organisation, project, tool, data, patent, news and event; (ii) concepts from the LT domain, such as technology and resource; and (iii) concepts from closely related domains, such as language, linguistics and mathematics. Whereas the research entities represent the general context, that is, a research environment as such, the LT entities define the information and knowledge space of the field, enhanced by entities from closely related areas. By managing information holistically, that is, within a research context, its inherent semantics becomes much more transparent. This paper introduces LT World as a reference information portal through ontological eyes: its content, its system, its method for maintaining knowledge-rich items, and its ontology as an asset.
This paper presents HENNA (Hybrid Person Name Analyzer), a novel system for identifying the language origin and analyzing the linguistic structures of person names. We apply maximum entropy (ME) based classification methods to language origin identification and achieve very promising performance. We show that word-internal character sequences provide surprisingly strong evidence for predicting the language origin of person names. Our approach is context-, language- and domain-independent and can thus easily be adapted to person names in or from other languages. Furthermore, we provide a novel strategy for handling origin ambiguities or multiple origins in a name. HENNA also provides a person name parser for the analysis of the linguistic and knowledge structures of person names. All the knowledge about a person name in HENNA is modelled in a person-name ontology, including the relationships between language origins, the linguistic features and grammars of person names of a specific language, and the interpretation of name elements. The approaches presented here are useful extensions of the named entity recognition task.
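As a toy illustration of classifying language origin from word-internal character sequences, the sketch below uses scikit-learn's logistic regression as a stand-in for the ME classifier; the names, labels and n-gram range are invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set; a real model needs far more data.
names = ["Schneider", "Müller", "Tanaka", "Suzuki", "Rossi", "Bianchi"]
origins = ["de", "de", "ja", "ja", "it", "it"]

clf = make_pipeline(
    # Word-internal character n-grams as features.
    CountVectorizer(analyzer="char", ngram_range=(2, 4)),
    # Logistic regression as a maximum-entropy-style classifier.
    LogisticRegression(max_iter=1000),
)
clf.fit(names, origins)
# Predictions on such a toy training set are illustrative only.
print(clf.predict(["Yamamoto", "Weber"]))
```

The point of the sketch is the feature choice: character n-grams alone, with no context or lexicon, carry much of the signal for origin prediction.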
We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the ACL Anthology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations in the MRS (minimal recursion semantics) format are simplified and generalized using WordNet. The resulting quriples are stored in a database from which they can be retrieved (again using abstraction methods) by relation-based search. The query interface is embedded in a web browser-based application we call the Scientists Workbench. It supports researchers in editing and searching scientific papers online.
This paper presents various strategies for improving the extraction performance for less prominent relations with the help of the rules learned for similar relations, for which large volumes of data with suitable data properties are available. The rules are learned via a minimally supervised machine learning system for relation extraction called DARE. Starting from semantic seeds, DARE extracts linguistic grammar rules associated with semantic roles from parsed news texts. The performance analysis across different experimental domains shows that data properties play an important role for DARE. In particular, the redundancy of the data and the connectivity of instances and pattern rules have a strong influence on recall. However, most real-world data sets do not possess the desirable small-world property. We therefore propose three strategies to overcome the data-property problem of some domains by exploiting a similar domain with better data properties. The first two strategies stay within the same corpus but try to extract new, similar relations with the learned rules. The third strategy adapts the learned rules to a new corpus. All three strategies show that frequently mentioned relations can help in the detection of less frequent relations.
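A highly simplified sketch of such a seed-based bootstrapping loop is shown below; plain string patterns stand in for DARE's grammar rules learned from parses, and all data is invented for illustration:

```python
def learn_rules(instances, corpus):
    """Generalize each sentence containing a known instance into a pattern."""
    return {sent.replace(inst, "<ARG>") for inst in instances
            for sent in corpus if inst in sent}

def apply_rules(rules, corpus):
    """Extract new instances that fill the <ARG> slot of a learned pattern."""
    found = set()
    for rule in rules:
        prefix, _, suffix = rule.partition("<ARG>")
        for sent in corpus:
            if sent.startswith(prefix) and sent.endswith(suffix):
                found.add(sent[len(prefix):len(sent) - len(suffix)])
    return found

def bootstrap(seeds, corpus, max_iter=5):
    """Alternate rule learning and instance extraction until a fixpoint."""
    instances, rules = set(seeds), set()
    for _ in range(max_iter):
        rules |= learn_rules(instances, corpus)
        new = apply_rules(rules, corpus) - instances
        if not new:  # nothing new extracted: stop
            break
        instances |= new
    return rules, instances

corpus = ["Marie Curie won the Nobel Prize",
          "Albert Einstein won the Nobel Prize"]
rules, insts = bootstrap({"Marie Curie"}, corpus)
print(sorted(insts))  # ['Albert Einstein', 'Marie Curie']
```

The toy run also hints at why redundancy and connectivity matter for recall: the second instance is only found because both sentences share a pattern, so sparse, weakly connected data starves the loop of new rules.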
We will describe new cross-lingual strategies for the development of multilingual information services on mobile devices. The novelty of our approach lies in the intelligent modeling of cross-lingual application domains and the combination of textual translation with speech generation. The final system helps users speak foreign languages and communicate with local people in relevant situations, such as restaurants, taxis and emergencies. The advantage of our information services is that they are robust enough for use in real-world situations. They are developed for the Beijing Olympic Games 2008, where most foreigners will have to rely on translation assistance. Their deployment is foreseen as part of the planned ubiquitous mobile information system of the Olympic Games.
Human translation is based on linguistic and extralinguistic knowledge. Despite promising pioneering advances, knowledge-based machine translation has remained a tempting vision. The bottleneck has been the engineering of sufficiently comprehensive bodies of relevant knowledge. The Semantic Web offers opportunities for the gradual evolution of a global heterogeneous knowledge base. The immediate target has been the modelling of certain knowledge domains by practical ontologies. In the talk we will demonstrate the utilization of ontological knowledge in different cross-lingual applications, ranging from cross-lingual document retrieval via cross-lingual question answering to complex information services involving several cross-lingual functionalities, including machine translation. We will then discuss the ramifications of this development, and of the evolution of the World Wide Web, for future directions in both statistical and rule-based machine translation.