Andrew W. Cole

Also published as: Andrew Cole


2006

Corpus Development and Publication
Andrew W. Cole
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper will discuss issues relevant to corpus development and publication at the LDC and will illustrate those issues by examining the history of three LDC corpora. This paper will also briefly examine alternative corpus creation and distribution methods and their challenges. The intent of this paper is to increase the available linguistic resources by describing the regulatory and technical environment and thus improving the understanding and interaction between corpus providers and distributors.

Integrated Linguistic Resources for Language Exploitation Technologies
Stephanie Strassel | Christopher Cieri | Andrew Cole | Denise Dipersio | Mark Liberman | Xiaoyi Ma | Mohamed Maamouri | Kazuaki Maeda
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The Linguistic Data Consortium (LDC) has recently embarked on an effort to create integrated linguistic resources and related infrastructure for language exploitation technologies within the DARPA GALE (Global Autonomous Language Exploitation) Program. GALE targets an end-to-end system consisting of three major engines: Transcription, Translation and Distillation. Multilingual speech or text from a variety of genres is taken as input and English text is given as output, with information of interest presented in an integrated and consolidated fashion to the end user. GALE's goals require a quantum leap in the performance of human language technology, while also demanding solutions that are more intelligent, more robust, more adaptable, more efficient and more integrated. LDC has responded to this challenge with a comprehensive approach to linguistic resource development designed to support GALE's research and evaluation needs and to provide lasting resources for the larger Human Language Technology community.