Task-based MT Evaluation: From Who/When/Where Extraction to Event Understanding

Jamal Laoudi, Calandra R. Tate, Clare R. Voss


Abstract
Task-based machine translation (MT) evaluation asks, how well do people perform text-handling tasks given MT output? This method of evaluation yields an extrinsic assessment of an MT engine, in terms of users’ task performance on MT output. While this method is time-consuming, its key advantage is that MT users and stakeholders understand how to interpret the assessment results. Prior experiments showed that subjects can extract individual who-, when-, and where-type elements of information from MT output passages that were not especially fluent. This paper presents the results of a pilot study to assess a slightly more complex task: when given such wh-items already identified in an MT output passage, how well can subjects properly select from and place these items into wh-typed slots to complete a sentence-template about the passage’s event? The results of the pilot with nearly sixty subjects, while only preliminary, indicate that this task was extremely challenging: given six test templates to complete, half of the subjects had no completely correct templates and 42% had exactly one completely correct template. The provisional interpretation of this pilot study is that event-based template completion defines a task ceiling, against which to evaluate future improvements on MT engines.
Anthology ID:
L06-1487
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/779_pdf.pdf
DOI:
Bibkey:
Cite (ACL):
Jamal Laoudi, Calandra R. Tate, and Clare R. Voss. 2006. Task-based MT Evaluation: From Who/When/Where Extraction to Event Understanding. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Task-based MT Evaluation: From Who/When/Where Extraction to Event Understanding (Laoudi et al., LREC 2006)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/779_pdf.pdf