Let’s not Argue about Semantics

Johan Bos


Abstract
What’s the best way to assess the performance of a semantic component in an NLP system? Tradition in NLP evaluation tells us that comparing output against a gold standard is a good idea. To define a gold standard, one first needs to decide on the representation language, and in many cases a first-order language seems a good compromise between expressive power and efficiency. Secondly, one needs to decide how to represent the various semantic phenomena, in particular the depth of analysis of quantification, plurals, eventualities, thematic roles, scope, anaphora, presupposition, ellipsis, comparatives, superlatives, tense, aspect, and time-expressions. Hence it will be hard to come up with an annotation scheme unless one permits different levels of semantic granularity. The alternative is a theory-neutral black-box type evaluation where we just look at how systems react to various inputs. For this approach, we can consider the well-known task of recognising textual entailment, or the lesser-known task of textual model checking. The disadvantage of black-box methods is that it is difficult to come up with natural data that cover specific semantic phenomena.
Anthology ID:
L08-1391
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/721_paper.pdf
Cite (ACL):
Johan Bos. 2008. Let’s not Argue about Semantics. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Let’s not Argue about Semantics (Bos, LREC 2008)