Marisa Jiménez
2003
Using decision trees to learn lexical information in a linguistics-based NLP system
Marisa Jiménez | Martine Pettenaro
Actes de la 10ème conférence sur le Traitement Automatique des Langues Naturelles. Posters
This paper describes the use of decision trees to learn lexical information for the enrichment of our natural language processing (NLP) system. Our approach to lexical learning differs from other approaches in the field in that our machine learning techniques exploit a deep knowledge understanding system. After the introduction, we present the overall architecture of our lexical learning module. The following sections present a case study of lexical learning with decision trees: learning which verbs take a human subject in Spanish and French.
2001
Generation of named entities
Marisa Jiménez
Proceedings of Machine Translation Summit VIII
In this paper we present an overview of an approach developed at Microsoft Research to generate strings for named entities such as places and dates. This approach uses abstract representations as input. We first provide an overview of our system for identifying named entities in text. Next, we present our approach to generating these entities from abstract representations, known as “logical forms” in our system. We then focus on the generation of place names in Spanish. We discuss our technique for generating Spanish place names from a logical form in which language-specific features, such as word order or capitalization conventions, do not exist. Finally, we present the details of a study that we carried out to help us make sound linguistic decisions in the generation of place names in Spanish.