2008
Arabic WordNet: Semi-automatic Extensions using Bayesian Inference
Horacio Rodríguez | David Farwell | Javi Ferreres | Manuel Bertran | Musa Alkhalifa | M. Antonia Martí
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This presentation focuses on the semi-automatic extension of Arabic WordNet (AWN) using lexical and morphological rules and applying Bayesian inference. We briefly report on the current status of AWN and propose a way of extending its coverage by taking advantage of a limited set of highly productive Arabic morphological rules for deriving a range of semantically related word forms from verb entries. The application of this set of rules, combined with the use of bilingual Arabic-English resources and Princeton's WordNet, allows the generation of a graph representing the semantic neighbourhood of the original word. In previous work, a set of associations between the hypothesized Arabic words and English synsets was proposed on the basis of this graph. Here, a novel approach to extending AWN is presented whereby a Bayesian Network is automatically built from the graph and then the net is used as an inferencing mechanism for scoring the set of candidate associations. Both on its own and in combination with the previous technique, this new approach has led to improved results.
2006
Parallel Syntactic Annotation of Multiple Languages
Owen Rambow | Bonnie Dorr | David Farwell | Rebecca Green | Nizar Habash | Stephen Helmreich | Eduard Hovy | Lori Levin | Keith J. Miller | Teruko Mitamura | Florence Reeder | Advaith Siddharthan
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
This paper describes an effort to investigate the incrementally deepening development of an interlingua notation, validated by human annotation of texts in English plus six languages. We begin with deep syntactic annotation, and in this paper present a series of annotation manuals for six different languages at the deep-syntactic level of representation. Many syntactic differences between languages are removed in the proposed syntactic annotation, making them useful resources for multilingual NLP projects with semantic components.
Pragmatics-based MT and the Translation of Puns
David Farwell | Stephen Helmreich
Proceedings of the 11th Annual Conference of the European Association for Machine Translation
Arabic WordNet and the Challenges of Arabic
Sabri Elkateb | William Black | Piek Vossen | David Farwell | Horacio Rodríguez | Adam Pease | Musa Alkhalifa | Christiane Fellbaum
Proceedings of the International Conference on the Challenge of Arabic for NLP/MT
Arabic WordNet is a lexical resource for Modern Standard Arabic based on the widely used Princeton WordNet for English (Fellbaum, 1998). Arabic WordNet (AWN) is based on the design and contents of the universally accepted Princeton WordNet (PWN) and will be mappable straightforwardly onto PWN 2.0 and EuroWordNet (EWN), enabling translation on the lexical level to English and dozens of other languages. We have developed and linked the AWN with the Suggested Upper Merged Ontology (SUMO), where concepts are defined with machine interpretable semantics in first order logic (Niles and Pease, 2001). We have greatly extended the ontology and its set of mappings to provide formal terms and definitions for each synset. The end product would be a linguistic resource with a deep formal semantic foundation that is able to capture the richness of Arabic as described in Elkateb (2005). Tools we have developed as part of this effort include a lexicographer's interface modeled on that used for EuroWordNet, with added facilities for Arabic script, following Black and Elkateb's earlier work (2004). In this paper we describe our methodology for building a lexical resource in Arabic and the challenge of Arabic for lexical resources.
2005
The FAME Speech-to-Speech Translation System for Catalan, English, and Spanish
Victoria Arranz | Elisabet Comelles | David Farwell
Proceedings of Machine Translation Summit X: Papers
This paper describes the evaluation of the FAME interlingua-based speech-to-speech translation system for Catalan, English and Spanish. This system is an extension of the already existing NESPOLE! that translates between English, French, German and Italian. This article begins with a brief introduction followed by a description of the system architecture and the components of the translation module including the Speech Recognizer, the analysis chain, the generation chain and the Speech Synthesizer. Then we explain the interlingua formalism used, called Interchange Format (IF). We show the results obtained from the evaluation of the system and we describe the three types of evaluation done. We also compare the results of our system with those obtained by a stochastic translator which has been independently developed over the course of the FAME project. Finally, we conclude with future work.
2004
Interlingual Annotation of Multilingual Text Corpora
Stephen Helmreich | David Farwell | Bonnie Dorr | Nizar Habash | Lori Levin | Teruko Mitamura | Florence Reeder | Keith Miller | Eduard Hovy | Owen Rambow | Advaith Siddharthan
Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004
A speech-to-speech translation system for Catalan, Spanish, and English
Victoria Arranz | Elisabet Comelles | David Farwell | Climent Nadeu | Jaume Padrell | Albert Febrer | Dorcas Alexander | Kay Peterson
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers
In this paper we describe the FAME interlingual speech-to-speech translation System for Spanish, Catalan and English which is intended to assist users in the reservation of a hotel room when calling or visiting abroad. The System has been developed as an extension of the existing NESPOLE! translation system [4] which translates between English, German, Italian and French. After a brief introduction we describe the Spanish and Catalan System components including speech recognition, transcription to IF mapping, IF to text generation and speech synthesis. We also present a task-oriented evaluation method used to inform about system development and some preliminary results.
Counting, measuring, ordering: translation problems and solutions
Stephen Helmreich | David Farwell
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers
This paper describes some difficulties associated with the translation of numbers (scalars) used for counting, measuring, or selecting items or properties. A set of problematic issues is described, and the presence of these difficulties is quantified by examining a set of texts and translations. An approach to a solution is suggested.
Interlingual annotation for MT development
Florence Reeder | Bonnie Dorr | David Farwell | Nizar Habash | Stephen Helmreich | Eduard Hovy | Lori Levin | Teruko Mitamura | Keith Miller | Owen Rambow | Advaith Siddharthan
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers
MT systems that use only superficial representations, including the current generation of statistical MT systems, have been successful and useful. However, they will experience a plateau in quality, much like other “silver bullet” approaches to MT. We pursue work on the development of interlingual representations for use in symbolic or hybrid MT systems. In this paper, we describe the creation of an interlingua and the development of a corpus of semantically annotated text, to be validated in six languages and evaluated in several ways. We have established a distributed, well-functioning research methodology, designed a preliminary interlingua notation, created annotation manuals and tools, developed a test collection in six languages with associated English translations, annotated some 150 translations, and designed and applied various annotation metrics. We describe the data sets being annotated and the interlingual (IL) representation language which uses two ontologies and a systematic theta-role list. We present the annotation tools built and outline the annotation process. Following this, we describe our evaluation methodology and conclude with a summary of issues that have arisen.
2003
Pragmatics-based translation and MT evaluation
David Farwell | Stephen Helmreich
Workshop on Systemizing MT Evaluation
In this paper the authors wish to present a view of translation equivalence related to a pragmatics-based approach to machine translation. We will argue that current evaluation methods which assume that there is a predictable correspondence between language forms cannot adequately account for this view. We will then describe a method for objectively determining the relative equivalence of two texts. However, given the need for both an open world assumption and non-monotonic inferencing, such a method cannot be realistically implemented and therefore certain "classic" evaluation strategies will continue to be preferable as practical methods of evaluation.
2001
Towards pragmatics-based machine translation
David Farwell | Stephen Helmreich
Workshop on MT2010: Towards a Road Map for MT
We propose a program of research which has as its goal establishing a framework and methodology for investigating the pragmatic aspects of the translation process and implementing a computational platform for carrying out systematic experiments on the pragmatics of translation. The program has four components. First, on the basis of a comparative study of multiple translations of the same document into a single target language, a pragmatics-based computational model is to be developed in which reasoning about the beliefs of the participants in the translation task and about the content of a text are central. Second, existing Natural Language Processing technologies are to be appraised as potential components of a computational platform that supports investigations into the effects of pragmatics on translation. Third, the platform is to be assembled and prototype translation systems implemented which conform to the pragmatics-based computational model of translation. Finally, a novel evaluation methodology is to be developed and evaluations of the systems carried out.
2000
Text meaning representation as a basis for representation of text interpretation
Stephen Helmreich | David Farwell
Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas: Technical Papers
In this paper we propose a representation for what we have called an interpretation of a text. We base this representation on TMR (Text Meaning Representation), an interlingual representation developed for Machine Translation purposes. A TMR consists of a complex feature-value structure, with the feature names and filler values drawn from an ontology, in this case, ONTOS, developed concurrently with TMR. We suggest on the basis of previous work, that a representation of an interpretation of a text must build on a TMR structure for the text in several ways: (1) by the inclusion of additional required features and feature values (which may themselves be complex feature structures); (2) by pragmatically filling in empty slots in the TMR structure itself; and (3) by supporting the connections between feature values by including, as part of the TMR itself, the chains of inferencing that link various parts of the structure.
NLP system oriented to anaphora resolution
Maximiliano Saiz-Noeda | Manuel Palomar | David Farwell
Proceedings of the International Conference on Machine Translation and Multilingual Applications in the new Millennium: MT 2000
An Interlingual-based Approach to Reference Resolution
David Farwell
NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP
1998
Breaking the quality ceiling
David Farwell
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Panel Descriptions
1997
User-Friendly Machine Translation: Alternate Translations Based on Differing Beliefs
David Farwell | Stephen Helmreich
Proceedings of Machine Translation Summit VI: Papers
In this paper the authors present a notion of “user-friendly” translation and describe a method for achieving it within a pragmatics-based approach to machine translation. The approach relies on modeling the beliefs of the participants in the translation process: the source language speaker and addressee, the translator and the target language addressee. Translation choices may vary according to how beliefs are ascribed to the various participants and, in particular, “user-friendly” choices are based on the beliefs ascribed to the TL addressee.
On representing language-specific information in interlingua
David Farwell
AMTA/SIG-IL First Workshop on Interlinguas
1996
Lexical Rules is Italicized
Stephen Helmreich | David Farwell
Breadth and Depth of Semantic Lexicons
Translation differences and pragmatics-based MT
Stephen Helmreich | David Farwell
Conference of the Association for Machine Translation in the Americas
Panel: Next steps in MT research
Lynn Carlson | Jaime Carbonell | David Farwell | Pierre Isabelle | Jackie Murgida | John O’Hara | Dekai Wu
Conference of the Association for Machine Translation in the Americas
1994
Pangloss: A Knowledge-based Machine Assisted Translation Research Project - Site 2
D. Farwell
Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994
Two Types of Adaptive MT Environments
Sergei Nirenburg | Robert Frederking | David Farwell | Yorick Wilks
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics
PANGLYZER: Spanish Language Analysis System
David Farwell | Steven Helmreich | Wanying Jin | Mark Casper | Jim Hargrave | Hugo Molina-Salgado | Fuliang Weng
Proceedings of the First Conference of the Association for Machine Translation in the Americas
Integrating Translations from Multiple Sources within the PANGLOSS Mark III Machine Translation System
Robert Frederking | Sergei Nirenburg | David Farwell | Steven Helmreich | Eduard Hovy | Kevin Knight | Stephen Beale | Constantino Domashnev | Donalee Attardo | Dean Grannes | Ralf Brown
Proceedings of the First Conference of the Association for Machine Translation in the Americas
PANGLOSS
Jaime Carbonell | David Farwell | Robert Frederking | Steven Helmreich | Eduard Hovy | Kevin Knight | Lori Levin | Sergei Nirenburg
Proceedings of the First Conference of the Association for Machine Translation in the Americas
1992
The Automatic Creation of Lexical Entries for a Multilingual MT System
David Farwell | Louise Guthrie | Yorick Wilks
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics
1991
ULTRA: A Multi-lingual Machine Translator
David Farwell | Yorick Wilks
Proceedings of Machine Translation Summit III: Papers
ULTRA (Universal Language TRAnslator) is a multilingual, interlingual machine translation system currently under development at the Computing Research Laboratory at New Mexico State University. It translates between five languages (Chinese, English, German, Japanese, Spanish) with vocabularies in each language based on approximately 10,000 word senses. The major design criteria are that the system be robust and general purpose with simple to use utilities for customization to suit the needs of particular users. This paper describes the central characteristics of the system: the intermediate representation, the language components, semantic and pragmatic processes, and supporting lexical entry tools.
1990
Machine Translation Again?
Yorick Wilks | Jaime Carbonell | David Farwell | Eduard Hovy | Sergei Nirenburg
Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27, 1990
1989
New Mexico State University Computing Research Laboratory
Yorick Wilks | David Farwell | Afzal Ballim | Roger Hartley
Speech and Natural Language: Proceedings of a Workshop Held at Philadelphia, Pennsylvania, February 21-23, 1989