Stochastic Spoken Natural Language Parsing in the Framework of the French MEDIA Evaluation Campaign

Dirk Bühler, Wolfgang Minker


Abstract
A stochastic parsing component has been applied to a French spoken language dialogue corpus, recorded in the framework of the MEDIA evaluation campaign. Realized as an ergodic HMM using Viterbi decoding, the parser outputs the most likely semantic representation given a transcribed utterance as input. The semantic sequences used for training and testing have been derived from the semantic representations of the MEDIA corpus. The HMM parameters have been estimated given the word sequences along with their semantic representation. The performance score of the stochastic parser has been automatically determined using the mediaval tool applied to a held-out reference corpus. Evaluation results will be presented in the paper.
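As an illustration of the decoding step named in the abstract, the following is a minimal sketch of Viterbi decoding over a fully connected (ergodic) HMM, where hidden states stand for semantic labels and observations are words. It is not the authors' implementation: the function name `viterbi`, the label set (`query`, `object`, `city`), the probability floor, and all probability values are hypothetical and chosen only to make the example runnable.

```python
import math


def viterbi(words, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely state (semantic label) sequence for `words`."""

    def lp(p):
        # Log-probability with a small floor so unseen events do not yield log(0).
        return math.log(max(p, floor))

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: lp(start_p.get(s, 0.0)) + lp(emit_p[s].get(words[0], 0.0))
             for s in states}]
    back = [{}]

    for t in range(1, len(words)):
        best.append({})
        back.append({})
        for s in states:
            # Ergodic topology: every state may follow every other state.
            prev, score = max(
                ((r, best[t - 1][r] + lp(trans_p[r].get(s, 0.0))) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lp(emit_p[s].get(words[t], 0.0))
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model (labels and numbers are assumptions, not MEDIA data).
states = ["query", "object", "city"]
start_p = {"query": 0.8, "object": 0.1, "city": 0.1}
trans_p = {
    "query": {"query": 0.1, "object": 0.7, "city": 0.2},
    "object": {"query": 0.1, "object": 0.2, "city": 0.7},
    "city": {"query": 0.3, "object": 0.3, "city": 0.4},
}
emit_p = {
    "query": {"combien": 0.9, "chambre": 0.05, "paris": 0.05},
    "object": {"combien": 0.05, "chambre": 0.9, "paris": 0.05},
    "city": {"combien": 0.05, "chambre": 0.05, "paris": 0.9},
}

decoded = viterbi(["combien", "chambre", "paris"], states, start_p, trans_p, emit_p)
print(decoded)
```

In a setting like the one described, the start, transition, and emission tables would be estimated from word sequences paired with their semantic labels rather than written by hand.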
Anthology ID:
L06-1213
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/363_pdf.pdf
Cite (ACL):
Dirk Bühler and Wolfgang Minker. 2006. Stochastic Spoken Natural Language Parsing in the Framework of the French MEDIA Evaluation Campaign. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Stochastic Spoken Natural Language Parsing in the Framework of the French MEDIA Evaluation Campaign (Bühler & Minker, LREC 2006)
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/363_pdf.pdf