Evaluation of HMM-based Models for the Annotation of Unsegmented Dialogue Turns

Carlos-D. Martínez-Hinarejos; Vicent Tamarit; José-Miguel Benedí

Abstract

Corpus-based dialogue systems rely on statistical models whose parameters are inferred from annotated dialogues. The dialogues are usually annotated in terms of Dialogue Acts (DA), and manual annotation is difficult (as annotation rules are hard to define), error-prone, and time-consuming. Therefore, several semi-automatic annotation processes have been proposed to speed up annotation and consequently obtain a dialogue system in less total time. These processes are usually based on statistical models. The standard statistical annotation model is based on Hidden Markov Models (HMM). In this work, we explore the impact of different types of HMM, with different numbers of states, on annotation accuracy. We performed experiments using these models on two dialogue corpora with dissimilar features (Dihana and SwitchBoard). The results show that some types of models improve on the standard HMM in a human-computer task-oriented dialogue corpus (Dihana), but their impact is lower in a human-human non-task-oriented dialogue corpus (SwitchBoard).

Anthology ID: L10-1209
Volume: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month: May
Year: 2010
Address: Valletta, Malta
Editors: Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue: LREC
Publisher: European Language Resources Association (ELRA)
URL: http://www.lrec-conf.org/proceedings/lrec2010/pdf/303_Paper.pdf
Cite (ACL): Carlos-D. Martínez-Hinarejos, Vicent Tamarit, and José-M. Benedí. 2010. Evaluation of HMM-based Models for the Annotation of Unsegmented Dialogue Turns. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal): Evaluation of HMM-based Models for the Annotation of Unsegmented Dialogue Turns (Martínez-Hinarejos et al., LREC 2010)
PDF: http://www.lrec-conf.org/proceedings/lrec2010/pdf/303_Paper.pdf
