Active Annotation in the LUNA Italian Corpus of Spontaneous Dialogues

Christian Raymond, Kepa Joseba Rodriguez, Giuseppe Riccardi


Abstract
In this paper we present an active approach to annotate with lexical and semantic labels an Italian corpus of conversational human-human and Wizard-of-Oz dialogues. This procedure consists in the use of a machine learner to assist human annotators in the labeling task. The computer assisted process engages human annotators to check and correct the automatic annotation rather than starting the annotation from un-annotated data. The active learning procedure is combined with an annotation error detection to control the reliablity of the annotation. With the goal of converging as fast as possible to reliable automatic annotations minimizing the human effort, we follow the active learning paradigm, which selects for annotation the most informative training examples required to achieve a better level of performance. We show that this procedure allows to quickly converge on correct annotations and thus minimize the cost of human supervision.
Anthology ID:
L08-1494
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/499_paper.pdf
DOI:
Bibkey:
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/499_paper.pdf