Nancy Ide

Also published as: Nancy M. Ide


2020

pdf bib
Infrastructure for Semantic Annotation in the Genomics Domain
Mahmoud El-Haj | Nathan Rutherford | Matthew Coole | Ignatius Ezeani | Sheryl Prentice | Nancy Ide | Jo Knight | Scott Piao | John Mariani | Paul Rayson | Keith Suderman
Proceedings of the 12th Language Resources and Evaluation Conference

We describe a novel super-infrastructure for biomedical text mining which incorporates an end-to-end pipeline for the collection, annotation, storage, retrieval and analysis of biomedical and life sciences literature, combining NLP and corpus linguistics methods. The infrastructure permits extreme-scale research on the open access PubMed Central archive. It combines an updatable Gene Ontology Semantic Tagger (GOST) for entity identification and semantic markup in the literature, with an NLP pipeline scheduler (Buster) to collect and process the corpus, and a bespoke columnar corpus database (LexiDB) for indexing. The corpus database is distributed to permit fast indexing, and provides a simple web front-end with corpus linguistics methods for sub-corpus comparison and retrieval. GOST is also connected as a service in the Language Application (LAPPS) Grid, in which context it is interoperable with other NLP tools and data in the Grid and can be combined with them in more complex workflows. In a literature-based discovery setting, we have created an annotated corpus of 9,776 papers with 5,481,543 words.

pdf bib
Interchange Formats for Visualization: LIF and MMIF
Kyeongmin Rim | Kelley Lynch | Marc Verhagen | Nancy Ide | James Pustejovsky
Proceedings of the 12th Language Resources and Evaluation Conference

Promoting interoperable computational linguistics (CL) and natural language processing (NLP) application platforms and interchangeable data formats has contributed to improving the discoverability and accessibility of openly available NLP software. In this paper, we discuss the enhanced data visualization capabilities that are also enabled by inter-operating NLP pipelines and interchange formats. For adding openly available visualization tools and graphical annotation tools to the Language Applications Grid (LAPPS Grid) and Computational Linguistics Applications for Multimedia Services (CLAMS) toolboxes, we have developed interchange formats that can carry annotations and metadata for text and audiovisual source data. We describe those data formats and present case studies where we successfully adopt open-source visualization tools and combine them with CL tools.

pdf bib
AskMe: A LAPPS Grid-based NLP Query and Retrieval System for Covid-19 Literature
Keith Suderman | Nancy Ide | Marc Verhagen | Brent Cochran | James Pustejovsky
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

In a recent project, the Language Application Grid was augmented to support the mining of scientific publications. The results of that effort have now been repurposed to focus on Covid-19 literature, including modification of the LAPPS Grid “AskMe” query and retrieval engine. We describe the AskMe system and discuss its functionality as compared to other query engines available to search Covid-related publications.

pdf bib
Towards Standardization of Web Service Protocols for NLPaaS
Jin-Dong Kim | Nancy Ide | Keith Suderman
Proceedings of the 1st International Workshop on Language Technology Platforms

Several web services for various natural language processing (NLP) tasks (“NLP-as-a-service” or NLPaaS) have recently been made publicly available. However, despite their similar functionality, these services often differ in the protocols they use, thus complicating the development of clients accessing them. A survey of currently available NLPaaS services suggests that it may be possible to identify a minimal application layer protocol that can be shared by NLPaaS services without sacrificing functionality or convenience, while at the same time simplifying the development of clients for these services. In this paper, we hope to raise awareness of the interoperability problems caused by the variety of existing web service protocols, and describe an effort to identify a set of best practices for NLPaaS protocol design. To that end, we survey and compare protocols used by NLPaaS services and suggest how these protocols may be further aligned to reduce variation.

2019

pdf bib
A Multi-Platform Annotation Ecosystem for Domain Adaptation
Richard Eckart de Castilho | Nancy Ide | Jin-Dong Kim | Jan-Christoph Klie | Keith Suderman
Proceedings of the 13th Linguistic Annotation Workshop

This paper describes an ecosystem consisting of three independent text annotation platforms. To demonstrate their ability to work in concert, we illustrate how to use them to address an interactive domain adaptation task in biomedical entity recognition. The platforms and the approach are in general domain-independent and can be readily applied to other areas of science.

2018

pdf bib
Three Dimensions of Reproducibility in Natural Language Processing
K. Bretonnel Cohen | Jingbo Xia | Pierre Zweigenbaum | Tiffany Callahan | Orin Hargraves | Foster Goss | Nancy Ide | Aurélie Névéol | Cyril Grouin | Lawrence E. Hunter
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Bridging the LAPPS Grid and CLARIN
Erhard Hinrichs | Nancy Ide | James Pustejovsky | Jan Hajič | Marie Hinrichs | Mohammad Fazleh Elahi | Keith Suderman | Marc Verhagen | Kyeongmin Rim | Pavel Straňák | Jozef Mišutka
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Mining Biomedical Publications With The LAPPS Grid
Nancy Ide | Keith Suderman | Jin-Dong Kim
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib
Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs
Richard Eckart de Castilho | Nancy Ide | Emanuele Lapponi | Stephan Oepen | Keith Suderman | Erik Velldal | Marc Verhagen
Proceedings of the 11th Linguistic Annotation Workshop

For decades, most self-respecting linguistic engineering initiatives have designed and implemented custom representations for various layers of, for example, morphological, syntactic, and semantic analysis. Despite occasional efforts at harmonization or even standardization, our field today is blessed with a multitude of ways of encoding and exchanging linguistic annotations of these types, at the levels of ‘abstract syntax’, naming choices, and of course file formats. To a large degree, it is possible to work within and across design plurality by conversion, and often there may be good reasons for divergent design reflecting differences in use. However, it is likely that some abstract commonalities across choices of representation are obscured by more superficial differences, and conversely there is no obvious procedure to tease apart what actually constitute contentful vs. mere technical divergences. In this study, we seek to conceptually align three representations for common types of morpho-syntactic analysis, pinpoint what in our view constitute contentful differences, and reflect on the underlying principles and specific requirements that led to individual choices. We expect that a more in-depth understanding of these choices across designs may lead to increased harmonization, or at least to more informed design of future representations.

pdf bib
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)
Nancy Ide | Aurélie Herbelot | Lluís Màrquez
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

2016

pdf bib
The Language Application Grid and Galaxy
Nancy Ide | Keith Suderman | James Pustejovsky | Marc Verhagen | Christopher Cieri
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The NSF-SI2-funded LAPPS Grid project is a collaborative effort among Brandeis University, Vassar College, Carnegie Mellon University (CMU), and the Linguistic Data Consortium (LDC), which has developed an open, web-based infrastructure through which resources can be easily accessed and within which tailored language services can be efficiently composed, evaluated, disseminated and consumed by researchers, developers, and students across a wide variety of disciplines. The LAPPS Grid project recently adopted Galaxy (Giardine et al., 2005), a robust, well-developed, and well-supported front end for workflow configuration, management, and persistence. Galaxy allows data inputs and processing steps to be selected from graphical menus, and results are displayed in intuitive plots and summaries that encourage interactive workflows and the exploration of hypotheses. The Galaxy workflow engine provides significant advantages for deploying pipelines of LAPPS Grid web services, including not only means to create and deploy locally-run and even customized versions of the LAPPS Grid as well as running the LAPPS Grid in the cloud, but also access to a huge array of statistical and visualization tools that have been developed for use in genomics research.

pdf bib
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)
Yohei Murakami | Donghui Lin | Nancy Ide | James Pustejovsky
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)

pdf bib
LAPPS/Galaxy: Current State and Next Steps
Nancy Ide | Keith Suderman | Eric Nyberg | James Pustejovsky | Marc Verhagen
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)

The US National Science Foundation (NSF) SI2-funded LAPPS/Galaxy project has developed an open-source platform for enabling complex analyses while hiding complexities associated with underlying infrastructure, that can be accessed through a web interface, deployed on any Unix system, or run from the cloud. It provides sophisticated tool integration and history capabilities, a workflow system for building automated multi-step analyses, state-of-the-art evaluation capabilities, and facilities for sharing and publishing analyses. This paper describes the current facilities available in LAPPS/Galaxy and outlines the project’s ongoing activities to enhance the framework.

2015

pdf bib
Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications
Christian Chiarcos | John Philip McCrae | Petya Osenova | Philipp Cimiano | Nancy Ide
Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications

2014

pdf bib
FrameNet and Linked Data
Nancy Ide
Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929-2014)

pdf bib
Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT
Nancy Ide | Jens Grivolla
Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT

pdf bib
The Language Application Grid Web Service Exchange Vocabulary
Nancy Ide | James Pustejovsky | Keith Suderman | Marc Verhagen
Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT

pdf bib
Biber Redux: Reconsidering Dimensions of Variation in American English
Rebecca J. Passonneau | Nancy Ide | Songqiao Su | Jesse Stuart
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
The Language Application Grid
Nancy Ide | James Pustejovsky | Christopher Cieri | Eric Nyberg | Di Wang | Keith Suderman | Marc Verhagen | Jonathan Wright
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The Language Application (LAPPS) Grid project is establishing a framework that enables language service discovery, composition, and reuse and promotes sustainability, manageability, usability, and interoperability of natural language processing (NLP) components. It is based on the service-oriented architecture (SOA), a more recent, web-oriented version of the “pipeline” architecture that has long been used in NLP for sequencing loosely-coupled linguistic analyses. The LAPPS Grid provides access to basic NLP processing tools and resources and enables pipelining such tools to create custom NLP applications, as well as composite services such as question answering and machine translation together with language resources such as mono- and multi-lingual corpora and lexicons that support NLP. The transformative aspect of the LAPPS Grid is that it orchestrates access to and deployment of language resources and processing functions available from servers around the globe and enables users to add their own language resources, services, and even service grids to satisfy their particular needs.

2013

pdf bib
Importing MASC into the ANNIS linguistic database: A case study of mapping GrAF
Arne Neumann | Nancy Ide | Manfred Stede
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse

2012

pdf bib
Proceedings of the Sixth Linguistic Annotation Workshop
Nancy Ide | Fei Xia
Proceedings of the Sixth Linguistic Annotation Workshop

pdf bib
A Model for Linguistic Resource Description
Nancy Ide | Keith Suderman
Proceedings of the Sixth Linguistic Annotation Workshop

pdf bib
The MASC Word Sense Corpus
Rebecca J. Passonneau | Collin F. Baker | Christiane Fellbaum | Nancy Ide
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The MASC project has produced a multi-genre corpus with multiple layers of linguistic annotation, together with a sentence corpus containing WordNet 3.1 sense tags for 1000 occurrences of each of 100 words produced by multiple annotators, accompanied by in-depth inter-annotator agreement data. Here we give an overview of the contents of MASC and then focus on the word sense sentence corpus, describing the characteristics that differentiate it from other word sense corpora and detailing the inter-annotator agreement studies that have been performed on the annotations. Finally, we discuss the potential to grow the word sense sentence corpus through crowdsourcing and the plan to enhance the content and annotations of MASC through a community-based collaborative effort.

pdf bib
Empirical Comparisons of MASC Word Sense Annotations
Gerard de Melo | Collin F. Baker | Nancy Ide | Rebecca J. Passonneau | Christiane Fellbaum
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We analyze how different conceptions of lexical semantics affect sense annotations and how multiple sense inventories can be compared empirically, based on annotated text. Our study focuses on the MASC project, where data has been annotated using WordNet sense identifiers on the one hand, and FrameNet lexical units on the other. This allows us to compare the sense inventories of these lexical resources empirically rather than just theoretically, based on their glosses, leading to new insights. In particular, we compute contingency matrices and develop a novel measure, the Expected Jaccard Index, that quantifies the agreement between annotations of the same data based on two different resources even when they have different sets of categories.

2011

pdf bib
Proceedings of the 5th Linguistic Annotation Workshop
Nancy Ide | Adam Meyers | Sameer Pradhan | Katrin Tomanek
Proceedings of the 5th Linguistic Annotation Workshop

2010

pdf bib
Anveshan: A Framework for Analysis of Multiple Annotators’ Labeling Behavior
Vikas Bhardwaj | Rebecca Passonneau | Ansaf Salleb-Aouissi | Nancy Ide
Proceedings of the Fourth Linguistic Annotation Workshop

pdf bib
Anatomy of Annotation Schemes: Mapping to GrAF
Nancy Ide | Harry Bunt
Proceedings of the Fourth Linguistic Annotation Workshop

pdf bib
The Manually Annotated Sub-Corpus: A Community Resource for and by the People
Nancy Ide | Collin Baker | Christiane Fellbaum | Rebecca Passonneau
Proceedings of the ACL 2010 Conference Short Papers

pdf bib
ANC2Go: A Web Application for Customized Corpus Creation
Nancy Ide | Keith Suderman | Brian Simms
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We describe a web application called “ANC2Go” that enables the user to select data from the Open American National Corpus (OANC) and the Manually Annotated Sub-corpus (MASC) together with some or all of the annotations available. The user also may select from among a variety of options for output format, or may receive the selected portions of the corpus and annotations in their original GrAF XML standoff format. The request is processed by merging the annotations selected and rendering them in the desired output format, then bundling the results and making them available for download. Thus, users can create a customized corpus with data and annotations of their choosing, delivered in the format that is most convenient for their use. ANC2Go will be released as a web service in the near future. Both the OANC and MASC are freely available for any use from the American National Corpus website and may be accessed through the ANC2Go application, or they may be downloaded in their entirety.

pdf bib
Word Sense Annotation of Polysemous Words by Multiple Annotators
Rebecca J. Passonneau | Ansaf Salleb-Aouissi | Vikas Bhardwaj | Nancy Ide
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We describe results of a word sense annotation task using WordNet, involving half a dozen well-trained annotators on ten polysemous words for three parts of speech. One hundred sentences for each word were annotated. Annotators had the same level of training and experience, but interannotator agreement (IA) varied across words. There was some effect of part of speech, with higher agreement on nouns and adjectives, but within the words for each part of speech there was wide variation. This variation in IA does not correlate with number of senses in the inventory, or the number of senses actually selected by annotators. In fact, IA was sometimes quite high for words with many senses. We claim that the IA variation is due to the word meanings, contexts of use, and individual differences among annotators. We find some correlation of IA with sense confusability as measured by a sense confusion threshold (CT). Data mining for association rules on a flattened data representation indicating each annotator's sense choices identifies outliers for some words, and systematic differences among pairs of annotators on others.

pdf bib
A Road Map for Interoperable Language Resource Metadata
Christopher Cieri | Khalid Choukri | Nicoletta Calzolari | D. Terence Langendoen | Johannes Leveling | Martha Palmer | Nancy Ide | James Pustejovsky
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

LRs remain expensive to create and thus rare relative to demand across languages and technology types. The accidental re-creation of an LR that already exists is a nearly unforgivable waste of scarce resources that is unfortunately not so easy to avoid. The number of catalogs the HLT researcher must search, with their different formats, makes it possible to overlook an existing resource. This paper sketches the sources of this problem and outlines a proposal to rectify it, along with a new vision of LR cataloging that will facilitate the documentation and exploitation of a much wider range of LRs than previously considered.

2009

pdf bib
Latin Etymologies as Features on BNC Text Categorization
Alex Chengyu Fang | Wanyin Li | Nancy Ide
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

pdf bib
Making Sense of Word Sense Variation
Rebecca Passonneau | Ansaf Salleb-Aouissi | Nancy Ide
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf bib
Proceedings of the Third Linguistic Annotation Workshop (LAW III)
Manfred Stede | Chu-Ren Huang | Nancy Ide | Adam Meyers
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

pdf bib
Bridging the Gaps: Interoperability for GrAF, GATE, and UIMA
Nancy Ide | Keith Suderman
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

pdf bib
The SILT and FlaReNet International Collaboration for Interoperability
Nancy Ide | James Pustejovsky | Nicoletta Calzolari | Claudia Soria
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

2008

pdf bib
MASC: the Manually Annotated Sub-Corpus of American English
Nancy Ide | Collin Baker | Christiane Fellbaum | Charles Fillmore | Rebecca Passonneau
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

To answer the critical need for sharable, reusable annotated resources with rich linguistic annotations, we are developing a Manually Annotated Sub-Corpus (MASC) including texts from diverse genres and manual annotations or manually-validated annotations for multiple levels, including WordNet senses and FrameNet frames and frame elements, both of which have become significant resources in the international computational linguistics community. To derive maximal benefit from the semantic information provided by these resources, the MASC will also include manually-validated shallow parses and named entities, which will enable linking WordNet senses and FrameNet frames within the same sentences into more complex semantic structures and, because named entities will often be the role fillers of FrameNet frames, enrich the semantic and pragmatic information derivable from the sub-corpus. All MASC annotations will be published with detailed inter-annotator agreement measures. The MASC and its annotations will be freely downloadable from the ANC website, thus providing maximum accessibility for researchers from around the globe.

pdf bib
A Bilingual Corpus of Inter-linked Events
Tommaso Caselli | Nancy Ide | Roberto Bartolini
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the creation of a bilingual corpus of inter-linked events for Italian and English. Linkage is accomplished through the Inter-Lingual Index (ILI) that links ItalWordNet with WordNet. The availability of this resource, on the one hand, enables contrastive analysis of the linguistic phenomena surrounding events in both languages, and on the other hand, can be used to perform multilingual temporal analysis of texts. In addition to describing the methodology for construction of the inter-linked corpus and the analysis of the data collected, we demonstrate that the ILI could potentially be used to bootstrap the creation of comparable corpora by exporting layers of annotation for words that have the same sense.

2007

pdf bib
Proceedings of the Linguistic Annotation Workshop
Branimir Boguraev | Nancy Ide | Adam Meyers | Shigeko Nariyama | Manfred Stede | Janyce Wiebe | Graham Wilcock
Proceedings of the Linguistic Annotation Workshop

pdf bib
GrAF: A Graph-based Format for Linguistic Annotations
Nancy Ide | Keith Suderman
Proceedings of the Linguistic Annotation Workshop

pdf bib
Shared Corpora Working Group Report
Adam Meyers | Nancy Ide | Ludovic Denoyer | Yusuke Shinyama
Proceedings of the Linguistic Annotation Workshop

2006

pdf bib
Layering and Merging Linguistic Annotations
Keith Suderman | Nancy Ide
Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing

pdf bib
Integrating Linguistic Resources: The American National Corpus Model
Nancy Ide | Keith Suderman
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the architecture of the American National Corpus and the design decisions we have made in order to make the corpus easy to use with a variety of existing tools with varying functionality, and to allow for layering multiple annotations over the data. The overall goal of the ANC project is to provide an “open linguistic infrastructure” for American English, consisting of as many self-generated or contributed annotations of the data as possible together with derived. The availability of a wide variety of annotations for the same data and in a common format should significantly simplify the processing required to extract annotations from different sources and enable use of the ANC and its annotations with off-the-shelf software.

pdf bib
Representing Linguistic Corpora and Their Annotations
Nancy Ide | Laurent Romary
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

A Linguistic Annotation Framework (LAF) is being developed within the International Organization for Standardization Technical Committee 37 Sub-committee on Language Resource Management (ISO TC37 SC4). LAF is intended to provide a standardized means to represent linguistic data and its annotations that is defined broadly enough to accommodate all types of linguistic annotations, and at the same time provide means to represent precise and potentially complex linguistic information. The general principles informing the design of LAF have been previously reported (Ide and Romary, 2003; Ide and Romary, 2004a). This paper describes some of the more technical aspects of the LAF design that have been addressed in the process of finalizing the specifications for the standard.

2004

pdf bib
Word Sense Disambiguation as a Wordnets’ Validation Method in Balkanet
Dan Tufis | Radu Ion | Nancy Ide
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
Exploiting Semantic Web Technologies for Intelligent Access to Historical Documents
Nancy Ide | David Woolner
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
The American National Corpus First Release
Nancy Ide | Keith Suderman
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
A Registry of Standard Data Categories for Linguistic Annotation
Nancy Ide | Laurent Romary
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets
Dan Tufis | Radu Ion | Nancy Ide
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

pdf bib
International Standard for a Linguistic Annotation Framework
Nancy Ide | Laurent Romary | Eric de la Clergerie
Proceedings of the HLT-NAACL 2003 Workshop on Software Engineering and Architecture of Language Technology Systems (SEALTS)

pdf bib
Outline of the International Standard Linguistic Annotation Framework
Nancy Ide | Laurent Romary
Proceedings of the ACL 2003 Workshop on Linguistic Annotation: Getting the Model Right

pdf bib
RDF Instantiation of ISLE/MILE Lexical Entries
Nancy Ide | Alessandro Lenci | Nicoletta Calzolari
Proceedings of the ACL 2003 Workshop on Linguistic Annotation: Getting the Model Right

2002

pdf bib
Towards Best Practice for Multiword Expressions in Computational Lexicons
Nicoletta Calzolari | Charles J. Fillmore | Ralph Grishman | Nancy Ide | Alessandro Lenci | Catherine MacLeod | Antonio Zampolli
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
The American National Corpus: More Than the Web Can Provide
Nancy Ide | Randi Reppen | Keith Suderman
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
Standards for Language Resources
Nancy Ide | Laurent Romary
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
Sense Discrimination with Parallel Corpora
Nancy Ide | Tomaz Erjavec | Dan Tufis
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

2001

pdf bib
A Common Framework for Syntactic Annotation
Nancy Ide | Laurent Romary
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

2000

pdf bib
XCES: An XML-based Encoding Standard for Linguistic Corpora
Nancy Ide | Patrice Bonhomme | Laurent Romary
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf bib
The American National Corpus: A Standardized Resource for American English
Catherine Macleod | Nancy Ide | Ralph Grishman
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf bib
The Concede Model for Lexical Databases
Tomaž Erjavec | Roger Evans | Nancy Ide | Adam Kilgarriff
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf bib
An Empirical Investigation of the Relation Between Discourse Structure and Co-Reference
Dan Cristea | Nancy Ide | Daniel Marcu | Valentin Tablan
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

pdf bib
A Hierarchical Account of Referential Accessibility
Nancy Ide | Dan Cristea
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
The XML Framework and Its Implications for the Development of Natural Language Processing Tools
Nancy Ide
Proceedings of the COLING-2000 Workshop on Using Toolsets and Architectures To Build NLP Systems

1999

pdf bib
Discourse Structure and Co-Reference: An Empirical Study
Dan Cristea | Nancy Ide | Daniel Marcu | Valentin Tablan
The Relation of Discourse/Dialogue Structure and Reference

pdf bib
Parallel Translations as Sense Discriminators
Nancy Ide
SIGLEX99: Standardizing Lexical Resources

1998

pdf bib
Veins Theory: A Model of Global Discourse Cohesion and Coherence
Dan Cristea | Nancy Ide | Laurent Romary
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

pdf bib
Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages
Ludmila Dimitrova | Tomaz Erjavec | Nancy Ide | Heiki Jaan Kaalep | Vladimir Petkevic | Dan Tufis
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

pdf bib
Veins Theory: A Model of Global Discourse Cohesion and Coherence
Dan Cristea | Nancy Ide | Laurent Romary
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

pdf bib
Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages
Ludmila Dimitrova | Tomaz Erjavec | Nancy Ide | Heiki Jaan Kaalep | Vladimir Petkevic | Dan Tufis
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

pdf bib
Encoding Linguistic Corpora
Nancy Ide
Sixth Workshop on Very Large Corpora

pdf bib
Proceedings of the Third Conference on Empirical Methods for Natural Language Processing
Nancy Ide | Atro Voutilainen
Proceedings of the Third Conference on Empirical Methods for Natural Language Processing

pdf bib
Introduction to the Special Issue on Word Sense Disambiguation: The State of the Art
Nancy Ide | Jean Véronis
Computational Linguistics, Volume 24, Number 1, March 1998 - Special Issue on Word Sense Disambiguation

pdf bib
Book Reviews: Text Databases: One Database Model and Several Retrieval Languages
Nancy Ide
Computational Linguistics, Volume 24, Number 2, June 1998

1994

pdf bib
Encoding standards for large text resources: The Text Encoding Initiative
Nancy Ide
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

pdf bib
MULTEXT: Multilingual Text Tools and Corpora
Nancy Ide | Jean Veronis
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

1993

bib
Knowledge extraction from machine-readable dictionaries: an evaluation
Nancy Ide | Jean Véronis
Third International EAMT Workshop: Machine Translation and the Lexicon

Machine-readable versions of everyday dictionaries have been seen as a likely source of information for use in natural language processing because they contain an enormous amount of lexical and semantic knowledge. However, after 15 years of research, the results appear to be disappointing. No comprehensive evaluation of machine-readable dictionaries (MRDs) as a knowledge source has been made to date, although this is necessary to determine what, if anything, can be gained from MRD research. To this end, this paper will first consider the postulates upon which MRD research has been based over the past fifteen years, discuss the validity of these postulates, and evaluate the results of this work. We will then propose possible future directions and applications that may exploit these years of effort, in the light of current directions in not only NLP research, but also fields such as lexicography and electronic publishing.

1992

pdf bib
A Feature-Based Model for Lexical Databases
Jean Veronis | Nancy Ide
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics

1991

pdf bib
An Assessment of Semantic Information Automatically Extracted From Machine Readable Dictionaries
Jean Veronis | Nancy Ide
Fifth Conference of the European Chapter of the Association for Computational Linguistics

1990

pdf bib
Word Sense Disambiguation with Very Large Neural Networks Extracted from Machine Readable Dictionaries
Jean Veronis | Nancy M. Ide
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics
