Tonio Wandmacher

2013

2011

The Quaero program is an international project promoting research and industrial innovation on technologies for the automatic analysis and classification of multimedia and multilingual documents. Within the program framework, research organizations and industrial partners collaborate to develop prototypes of innovative applications and services for accessing and using multimedia data. One of the topics addressed is the translation of spoken language. Each year, a project-internal evaluation is conducted by DGA to monitor technological advances. This work describes the design and results of the 2011 evaluation campaign. The participating partners were RWTH, KIT, LIMSI and SYSTRAN. Their approaches are compared on both ASR output and reference transcripts of speech data for translation between French and German. The results show that the developed techniques advance the state of the art and improve translation quality.

2009

Automatic Acquisition of the Argument-Predicate Relations from a Frame-Annotated Corpus
Ekaterina Ovchinnikova | Theodore Alexandrov | Tonio Wandmacher
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2007

Methods to Integrate a Language Model with Semantic Information for a Word Prediction Component
Tonio Wandmacher | Jean-Yves Antoine
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Modèle adaptatif pour la prédiction de mots. Adaptation à l’utilisateur et au contexte dans le cadre de la communication assistée pour personnes handicapées [Adaptive model for word prediction. Adaptation to user and context in assistive communication for people with disabilities]
Tonio Wandmacher | Jean-Yves Antoine
Traitement Automatique des Langues, Volume 48, Numéro 2 : Communication Assistée [Assisted communication]

2006

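Adaptation de modèles de langage à l’utilisateur et au registre de langage : expérimentations dans le domaine de l’aide au handicap [Adapting language models to the user and to the language register: experiments in the field of disability assistance]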
Tonio Wandmacher | Jean-Yves Antoine
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

Markov language models depend heavily on the training data from which they are learned. This dependency, which makes performance difficult to interpret, has a particularly strong impact on adapting these models to each user. This question has already been widely studied. Building on a specific application domain (text prediction in communication aids for people with disabilities), we would like to extend it to the influence of the language register. Considering corpora from five different genres, we studied how far this influence is reduced by three different adaptive models: (a) a classic cache model favouring the last n words encountered, (b) the integration of a dynamic user dictionary into the model, and (c) an interpolated language model combining a general model with a user model that is updated dynamically as text is entered. The evaluation is carried out on a text prediction system based on a trigram model.

Training Language Models without Appropriate Language Resources: Experiments with an AAC System for Disabled People
Tonio Wandmacher | Jean-Yves Antoine
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Statistical language models (LMs) are highly dependent on their training resources. This not only makes evaluation results difficult to interpret, it also has a detrimental effect on the use of an LM-based application. This question has already been studied by others. Considering a specific domain (text prediction in a communication aid for handicapped people), we address the problem from a different point of view: the influence of the language register. Using corpora from five different registers, we discuss three methods for adapting a language model to its actual language resource, ultimately reducing the effect of training dependency: (a) a simple cache model that raises the probability of the last n inserted words; (b) a user dictionary that keeps every unseen word; and (c) a combined LM that interpolates a base model with a dynamically updated user model. Our evaluation is based on results obtained from a text prediction system working on a trigram LM.
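
As a rough illustration of method (c), the sketch below interpolates a fixed base model with a user model that is updated as words are entered. It is not the authors' implementation: for brevity it uses unigram probabilities rather than the trigram LM mentioned in the abstract, and the class name `InterpolatedPredictor`, the weight `lam`, and the toy vocabulary are all hypothetical.

```python
from collections import Counter


class InterpolatedPredictor:
    """Toy unigram predictor: P(w) = lam * P_base(w) + (1 - lam) * P_user(w)."""

    def __init__(self, base_probs, lam=0.7):
        self.base_probs = dict(base_probs)  # fixed general model (assumed given)
        self.lam = lam                      # weight of the base model (assumed value)
        self.user_counts = Counter()        # user model, updated during input

    def observe(self, word):
        """Update the user model with a word the user has just entered."""
        self.user_counts[word] += 1

    def user_prob(self, word):
        """Relative frequency of the word in the user's input so far."""
        total = sum(self.user_counts.values())
        return self.user_counts[word] / total if total else 0.0

    def prob(self, word):
        """Linear interpolation of the base model and the user model."""
        base = self.base_probs.get(word, 0.0)
        return self.lam * base + (1 - self.lam) * self.user_prob(word)

    def predict(self, k=3):
        """Return the k most probable words (ties broken alphabetically)."""
        vocab = set(self.base_probs) | set(self.user_counts)
        return sorted(vocab, key=lambda w: (-self.prob(w), w))[:k]


predictor = InterpolatedPredictor({"the": 0.6, "cat": 0.4}, lam=0.5)
predictor.observe("zorro")  # a word unknown to the base model
predictor.observe("zorro")
print(predictor.predict(k=2))
```

Because words unknown to the base model receive probability mass through the user model, this sketch also loosely covers the role of the user dictionary in (b); a cache model as in (a) could be approximated by restricting the user counts to the last n words.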

2005

How semantic is Latent Semantic Analysis?
Tonio Wandmacher
Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues

In the past decade, Latent Semantic Analysis (LSA) has been used in many NLP approaches, sometimes with remarkable success. However, its ability to express semantic relatedness has not yet been systematically investigated. This is the aim of our work: LSA is applied to a general text corpus (German newspaper), and for a test vocabulary, the lexical relations between each test word and its closest neighbours are analysed. These results are compared with those of a collocation analysis.

Venues

LREC (1)

TAL (1)