2020
SemEval-2020 Task 6: Definition Extraction from Free Text with the DEFT Corpus
Sasha Spala | Nicholas Miller | Franck Dernoncourt | Carl Dockhorn
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Research on definition extraction has been conducted for well over a decade, largely under significant constraints on the types of definitions considered. In this work, we present DeftEval, a SemEval shared task in which participants must extract definitions from free text using a term-definition pair corpus that reflects the complex reality of definitions in natural language. Definitions and glosses in free text often appear without explicit indicators, across sentence boundaries, or in otherwise linguistically complex constructions. DeftEval comprised three distinct subtasks: 1) sentence classification, 2) sequence labeling, and 3) relation extraction.
2019
DEFT: A corpus for definition extraction in free- and semi-structured text
Sasha Spala | Nicholas A. Miller | Yiming Yang | Franck Dernoncourt | Carl Dockhorn
Proceedings of the 13th Linguistic Annotation Workshop
Definition extraction has been a popular topic in NLP research for well over a decade, but has historically been limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and corpora that reflect that reality. In this paper, we present a robust English corpus and annotation schema that allow us to explore less straightforward examples of term-definition structures in free and semi-structured text.
2018
A Comparison Study of Human Evaluated Automated Highlighting Systems
Sasha Spala | Franck Dernoncourt | Walter Chang | Carl Dockhorn
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation
A Web-based Framework for Collecting and Assessing Highlighted Sentences in a Document
Sasha Spala | Franck Dernoncourt | Walter Chang | Carl Dockhorn
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations
Automatically highlighting a text aims to identify the portions that are most important to a reader. In this paper, we present a web-based framework designed to efficiently and scalably crowdsource two independent but related tasks: collecting highlight annotations, and comparing the performance of automated highlighting systems. The first task is necessary to understand human preferences and to train supervised automated highlighting systems. The second task yields a more accurate and fine-grained evaluation than existing automated performance metrics.