Sergei Nirenburg

Also published as: Sergei Nirenberg, S. Nirenburg


2017

2016

Language-endowed intelligent agents benefit from leveraging lexical knowledge falling at different points along a spectrum of compositionality. This means that robust computational lexicons should include not only the compositional expectations of argument-taking words, but also non-compositional collocations (idioms), semi-compositional collocations that might be difficult for an agent to interpret (e.g., standard metaphors), and even collocations that could be compositionally analyzed but are so frequently encountered that recording their meaning increases the efficiency of interpretation. In this paper we argue that yet another type of string-to-meaning mapping can also be useful to intelligent agents: remembered semantic analyses of actual text inputs. These can be viewed as super-specific multi-word expressions whose recorded interpretations mimic a person’s memories of knowledge previously learned from language input. These differ from typical annotated corpora in two ways. First, they provide a full, context-sensitive semantic interpretation rather than select features. Second, they are are formulated in the ontologically-grounded metalanguage used in a particular agent environment, meaning that the interpretations contribute to the dynamically evolving cognitive capabilites of agents configured in that environment.

2008

2005

This paper describes one approach to document authoring and natural language generation being pursued by the Summer Institute of Linguistics in cooperation with the University of Maryland, Baltimore County. We will describe the tools provided for document authoring, including a glimpse at the underlying controlled language and the semantic representation of the textual meaning. We will also introduce The Bible Translator’s Assistant© (TBTA), which is used to elicit and enter target language data as well as perform the actual text generation process. We conclude with a discussion of the usefulness of this paradigm from a Bible translation perspective and suggest several ways in which this work will benefit the field of computational linguistics.

2004

2003

2002

2001

2000

1999

The paper describes an approach to developing an interactive MT system for translating technical texts on the example of translating patent claims between Russian and English. The approach conforms to the human-aided machine translation paradigm. The system is meant for a source language (SL) speaker who does not know the target language (TL). It consists of i) an analysis module which includes a submodule of interactive syntactic analysis of SL text and a submodule of fully automated morphological analysis, ii) an automatic module for transferring the lexical and partially syntactic content of SL text into a similar content of the TL text and iii) a fully automated TL text generation module which relies on knowledge about the legal format of TL patent claims. An interactive analysis module guides the user through a sequence of SL analysis procedures, as a result of which the system produces a set of internal knowledge structures which serve as input to the TL text generation. Both analysis and generation rely heavily on the analysis of the sublanguage of patent claims. The model has been developed for English and Russian as both SLs and TLs but is readily extensible to other languages.
In this paper we describe a lexical disambiguation algorithm based on a statistical language model we call maximum likelihood disambiguation. The maximum likelihood method depends solely on the target language. The model was trained on a corpus of American English newspaper texts. Its performance was tested using output from a transfer based translation system between Turkish and English. The method is source language independent, and can be used for systems translating from any language into English.

1998

1997

1996

1995

1994

1993

1992

1991

1990

1989

1988

1987

1986

1985

1984

1982