Yukie Nakao

2010

We describe a multilingual Open Source CALL game, CALL-SLT, which reuses speech translation technology developed using the Regulus platform to create an automatic conversation partner that allows intermediate-level language students to improve their fluency. We contrast CALL-SLT with Wang's and Seneff's ``translation game'' system, in particular focussing on three issues. First, we argue that the grammar-based recognition architecture offered by Regulus is more suitable for this type of application; second, that it is preferable to prompt the student in a language-neutral form, rather than in the L1; and third, that we can profitably record successful interactions by native speakers and store them to be reused as online help for students. The current system, which will be demoed at the conference, supports four L2s (English, French, Japanese and Swedish) and two L1s (English and French). We conclude by describing an evaluation exercise, where a version of CALL-SLT configured for English L2 and French L1 was used by several hundred high school students. About half of the subjects reported positive impressions of the system.

2009

pdf bib

Using Artificially Generated Data to Evaluate Statistical Machine Translation
Manny Rayner | Paula Estrella | Pierrette Bouillon | Beth Ann Hockey | Yukie Nakao
Proceedings of the 2009 Workshop on Grammar Engineering Across Frameworks (GEAF 2009)

pdf bib

Using Artificial Data to Compare the Difficulty of Using Statistical Machine Translation in Different Language-Pairs
Manny Rayner | Paula Estrella | Pierrette Bouillon | Yukie Nakao
Proceedings of Machine Translation Summit XII: Posters

2008

pdf bib abs

Many-to-Many Multilingual Medical Speech Translation on a PDA
Kyoko Kanzaki | Yukie Nakao | Manny Rayner | Marianne Santaholma | Marianne Starlander | Nikos Tsourakis
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Government and Commercial Uses of MT

Particularly considering the requirement of high reliability, we argue that the most appropriate architecture for a medical speech translator that can be realised using today’s technology combines unidirectional (doctor to patient) translation, medium-vocabulary controlled language coverage, interlingua-based translation, an embedded help component, and deployability on a hand-held hardware platform. We present an overview of the Open Source MedSLT prototype, which has been developed in accordance with these design principles. The system is implemented on top of the Regulus and Nuance 8.5 platforms, translates patient examination questions for all language pairs in the set {English, French, Japanese, Arabic, Catalan}, using vocabularies of about 400 to 1 100 words, and can be run in a distributed client/server environment, where the client application is hosted on a Nokia Internet Tablet device.

pdf bib

Almost Flat Functional Semantics for Speech Translation
Manny Rayner | Pierrette Bouillon | Beth Ann Hockey | Yukie Nakao
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib

pdf bib abs

We describe recent work on MedSLT, a medium-vocabulary interlingua-based medical speech translation system, focussing on issues that arise when handling languages of which the grammar engineer has little or no knowledge. We show how we can systematically create and maintain multiple forms of grammars, lexica and interlingual representations, with some versions being used by language informants, and some by grammar engineers. In particular, we describe the advantages of structuring the interlingua definition as a simple semantic grammar, which includes a human-readable surface form. We show how this allows us to rationalise the process of evaluating translations between languages lacking common speakers, and also makes it possible to create a simple generic tool for debugging to-interlingua translation rules. Examples presented focus on the concrete case of translation between Japanese and Arabic in both directions.

2006

pdf bib

pdf bib

Une grammaire partagée multitâche pour le traitement de la parole : application aux langues romanes [A multitask shared grammar for speech processing: application to romance languages]
Pierrette Bouillon | Manny Rayner | Bruna Novellas | Marianne Starlander | Marianne Santaholma | Yukie Nakao | Nikos Chatzichrisafis
Traitement Automatique des Langues, Volume 47, Numéro 3 : Varia [Varia]

pdf bib abs

Une grammaire multilingue partagée pour la traduction automatique de la parole
Pierrette Bouillon | Manny Rayner | Bruna Novellas | Yukie Nakao | Marianne Santaholma | Marianne Starlander | Nikos Chatzichrisafis
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

Aujourd’hui, l’approche la plus courante en traitement de la parole consiste à combiner un reconnaisseur statistique avec un analyseur robuste. Pour beaucoup d’applications cependant, les reconnaisseurs linguistiques basés sur les grammaires offrent de nombreux avantages. Dans cet article, nous présentons une méthodologie et un ensemble de logiciels libres (appelé Regulus) pour dériver rapidement des reconnaisseurs linguistiquement motivés à partir d’une grammaire générale partagée pour le catalan et le français.

2005

pdf bib abs

Representational and architectural issues in a limited-domain medical speech translator
Manny Rayner | Pierrette Bouillon | Marianne Santaholma | Yukie Nakao
Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

We present an overview of MedSLT, a medium-vocabulary medical speech translation system, focussing on the representational issues that arise when translating temporal and causal concepts. Although flat key/value structures are strongly preferred as semantic representations in speech understanding systems, we argue that it is infeasible to handle the necessary range of concepts using only flat structures. By exploiting the specific nature of the task, we show that it is possible to implement a solution which only slightly extends the representational complexity of the semantic representation language, by permitting an optional single nested level representing a subordinate clause construct. We sketch our solutions to the key problems of producing minimally nested representations using phrase-spotting methods, and writing cleanly structured rule-sets that map temporal and phrasal representations into a canonical interlingual form.

pdf bib abs

In this paper, we present evidence that providing users of a speech to speech translation system for emergency diagnosis (MedSLT) with a tool that helps them to learn the coverage greatly improves their success in using the system. In MedSLT, the system uses a grammar-based recogniser that provides more predictable results to the translation component. The help module aims at addressing the lack of robustness inherent in this type of approach. It takes as input the result of a robust statistical recogniser that performs better for out-of-coverage data and produces a list of in-coverage example sentences. These examples are selected from a defined list using a heuristic that prioritises sentences maximising the number of N-grams shared with those extracted from the recognition result.

pdf bib