Carl Rubino


2022

The IARPA BETTER Program Abstract Task Four New Semantically Annotated Corpora from IARPA’s BETTER Program
Timothy Mckinnon | Carl Rubino
Proceedings of the Thirteenth Language Resources and Evaluation Conference

IARPA’s Better Extraction from Text Towards Enhanced Retrieval (BETTER) Program created multiple multilingual datasets to spawn and evaluate cross-language information extraction and information retrieval research and development in zero-shot conditions. The first set of these resources for information extraction, the “Abstract” data, will be released to the public at LREC 2022 in four languages to champion further information extraction work in this area. This paper presents the event and argument annotation in the Abstract Evaluation phase of BETTER, as well as the data collection, preparation, partitioning and mark-up of the datasets.

2020

The Effect of Linguistic Parameters in CLIR Performance
Carl Rubino
Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)

This paper will detail how IARPA’s MATERIAL Cross-Language Information Retrieval (CLIR) program investigated certain linguistic parameters to guide language choice, data collection and partitioning, and to understand evaluation results. Discerning which linguistic parameters correlated with overall performance enabled the evaluation of progress when different languages were measured, and was also an important factor in determining the most effective CLIR pipeline design, customized to handle the language-specific properties deemed necessary to address.

2018

Keynote: Setting up a Machine Translation Program for IARPA
Carl Rubino
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)

2016

Machine Translation for English Retrieval of Information in Any Language (Machine translation for English-based domain-appropriate triage of information in any language)
Carl Rubino
Conferences of the Association for Machine Translation in the Americas: MT Users' Track