Alison Alvarez


pdf bib
Linguistic Structure and Bilingual Informants Help Induce Machine Translation of Lesser-Resourced Languages
Christian Monson | Ariadna Font Llitjós | Vamshi Ambati | Lori Levin | Alon Lavie | Alison Alvarez | Roberto Aranovich | Jaime Carbonell | Robert Frederking | Erik Peterson | Katharina Probst
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Producing machine translation (MT) for the many minority languages in the world is a serious challenge. Minority languages typically have few resources for building MT systems. For many minor languages there is little machine readable text, few knowledgeable linguists, and little money available for MT development. For these reasons, our research programs on minority language MT have focused on leveraging to the maximum extent two resources that are available for minority languages: linguistic structure and bilingual informants. All natural languages contain linguistic structure. And although the details of that linguistic structure vary from language to language, language universals such as context-free syntactic structure and the paradigmatic structure of inflectional morphology, allow us to learn the specific details of a minority language. Similarly, most minority languages possess speakers who are bilingual with the major language of the area. This paper discusses our efforts to utilize linguistic structure and the translation information that bilingual informants can provide in three sub-areas of our rapid development MT program: morphology induction, syntactic transfer rule learning, and refinement of imperfect learned rules.


pdf bib
An assessment of language elicitation without the supervision of a linguist
Alison Alvarez | Lori Levin | Robert Frederking | Jill Lehman
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers


pdf bib
The MILE Corpus for Less Commonly Taught Languages
Alison Alvarez | Lori Levin | Robert Frederking | Simon Fung | Donna Gates | Jeff Good
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers


pdf bib
Semi-Automated Elicitation Corpus Generation
Alison Alvarez | Lori Levin | Robert Frederking | Erik Peterson | Jeff Good
Proceedings of Machine Translation Summit X: Posters

In this document we will describe a semi-automated process for creating elicitation corpora. An elicitation corpus is translated by a bilingual consultant in order to produce high quality word aligned sentence pairs. The corpus sentences are automatically generated from detailed feature structures using the GenKit generation program. Feature structures themselves are automatically generated from information that is provided by a linguist using our corpus specification software. This helps us to build small, flexible corpora for testing and development of machine translation systems.