Workshop on Evaluation and Comparison of NLP Systems (2022)


Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems
Daniel Deutsch | Can Udomcharoenchaikit | Juri Opitz | Yang Gao | Marina Fomicheva | Steffen Eger

A Japanese Corpus of Many Specialized Domains for Word Segmentation and Part-of-Speech Tagging
Shohei Higashiyama | Masao Ideuchi | Masao Utiyama | Yoshiaki Oida | Eiichiro Sumita

Assessing Resource-Performance Trade-off of Natural Language Models using Data Envelopment Analysis
Zachary Zhou | Alisha Zachariah | Devin Conathan | Jeffery Kline

From COMET to COMES – Can Summary Evaluation Benefit from Translation Evaluation?
Mateusz Krubiński | Pavel Pecina

Better Smatch = Better Parser? AMR evaluation is not so simple anymore
Juri Opitz | Anette Frank

GLARE: Generative Left-to-right AdversaRial Examples
Ryan Andrew Chi | Nathan Kim | Patrick Liu | Zander Lack | Ethan A Chi

Random Text Perturbations Work, but not Always
Zhengxiang Wang

A Comparative Analysis of Stance Detection Approaches and Datasets
Parush Gera | Tempestt Neal

Why is sentence similarity benchmark not predictive of application-oriented task performance?
Kaori Abe | Sho Yokoi | Tomoyuki Kajiwara | Kentaro Inui

Chat Translation Error Detection for Assisting Cross-lingual Communications
Yunmeng Li | Jun Suzuki | Makoto Morishita | Kaori Abe | Ryoko Tokuhisa | Ana Brassard | Kentaro Inui

Evaluating the role of non-lexical markers in GPT-2’s language modeling behavior
Roberta Rocca | Alejandro de la Vega

Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset
Guanyi Chen | Fahime Same | Kees Van Deemter