Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems
Steffen
Eger
editor
Yang
Gao
editor
Maxime
Peyrard
editor
Wei
Zhao
editor
Eduard
Hovy
editor
2020-11
Association for Computational Linguistics
Online
text
conference publication
eval4nlp-2020-evaluation
https://aclanthology.org/2020.eval4nlp-1.0
Truth or Error? Towards systematic analysis of factual errors in abstractive summaries
Klaus-Michael
Lux
author
Maya
Sappelli
author
Martha
Larson
author
2020-11
text
lux-etal-2020-truth
10.18653/v1/2020.eval4nlp-1.1
https://aclanthology.org/2020.eval4nlp-1.1
2020-11
1
10
Fill in the BLANC: Human-free quality estimation of document summaries
Oleg
Vasilyev
author
Vedant
Dharnidharka
author
John
Bohannon
author
2020-11
text
vasilyev-etal-2020-fill
10.18653/v1/2020.eval4nlp-1.2
https://aclanthology.org/2020.eval4nlp-1.2
2020-11
11
20
Item Response Theory for Efficient Human Evaluation of Chatbots
João
Sedoc
author
Lyle
Ungar
author
2020-11
text
sedoc-ungar-2020-item
10.18653/v1/2020.eval4nlp-1.3
https://aclanthology.org/2020.eval4nlp-1.3
2020-11
21
33
ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT
Hwanhee
Lee
author
Seunghyun
Yoon
author
Franck
Dernoncourt
author
Doo
Soon
Kim
author
Trung
Bui
author
Kyomin
Jung
author
2020-11
text
lee-etal-2020-vilbertscore
10.18653/v1/2020.eval4nlp-1.4
https://aclanthology.org/2020.eval4nlp-1.4
2020-11
34
39
BLEU Neighbors: A Reference-less Approach to Automatic Evaluation
Kawin
Ethayarajh
author
Dorsa
Sadigh
author
2020-11
text
ethayarajh-sadigh-2020-bleu
10.18653/v1/2020.eval4nlp-1.5
https://aclanthology.org/2020.eval4nlp-1.5
2020-11
40
50
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance
Xi
Chen
author
Nan
Ding
author
Tomer
Levinboim
author
Radu
Soricut
author
2020-11
text
chen-etal-2020-improving-text
10.18653/v1/2020.eval4nlp-1.6
https://aclanthology.org/2020.eval4nlp-1.6
2020-11
51
59
On the Evaluation of Machine Translation n-best Lists
Jacob
Bremerman
author
Huda
Khayrallah
author
Douglas
Oard
author
Matt
Post
author
2020-11
text
bremerman-etal-2020-evaluation
10.18653/v1/2020.eval4nlp-1.7
https://aclanthology.org/2020.eval4nlp-1.7
2020-11
60
68
Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization
Rahul
Jha
author
Keping
Bi
author
Yang
Li
author
Mahdi
Pakdaman
author
Asli
Celikyilmaz
author
Ivan
Zhiboedov
author
Kieran
McDonald
author
2020-11
text
jha-etal-2020-artemis
10.18653/v1/2020.eval4nlp-1.8
https://aclanthology.org/2020.eval4nlp-1.8
2020-11
69
78
Probabilistic Extension of Precision, Recall, and F1 Score for More Thorough Evaluation of Classification Models
Reda
Yacouby
author
Dustin
Axman
author
2020-11
text
yacouby-axman-2020-probabilistic
10.18653/v1/2020.eval4nlp-1.9
https://aclanthology.org/2020.eval4nlp-1.9
2020-11
79
91
A survey on Recognizing Textual Entailment as an NLP Evaluation
Adam
Poliak
author
2020-11
text
poliak-2020-survey
10.18653/v1/2020.eval4nlp-1.10
https://aclanthology.org/2020.eval4nlp-1.10
2020-11
92
109
Grammaticality and Language Modelling
Jingcheng
Niu
author
Gerald
Penn
author
2020-11
text
niu-penn-2020-grammaticality
10.18653/v1/2020.eval4nlp-1.11
https://aclanthology.org/2020.eval4nlp-1.11
2020-11
110
119
One of these words is not like the other: a reproduction of outlier identification using non-contextual word representations
Jesper
Brink Andersen
author
Mikkel
Bak Bertelsen
author
Mikkel
Hørby Schou
author
Manuel
R
Ciosici
author
Ira
Assent
author
2020-11
text
brink-andersen-etal-2020-one
10.18653/v1/2020.eval4nlp-1.12
https://aclanthology.org/2020.eval4nlp-1.12
2020-11
120
130
Are Some Words Worth More than Others?
Shiran
Dudy
author
Steven
Bedrick
author
2020-11
text
dudy-bedrick-2020-words
10.18653/v1/2020.eval4nlp-1.13
https://aclanthology.org/2020.eval4nlp-1.13
2020-11
131
142
On Aligning OpenIE Extractions with Knowledge Bases: A Case Study
Kiril
Gashteovski
author
Rainer
Gemulla
author
Bhushan
Kotnis
author
Sven
Hertling
author
Christian
Meilicke
author
2020-11
text
gashteovski-etal-2020-aligning
10.18653/v1/2020.eval4nlp-1.14
https://aclanthology.org/2020.eval4nlp-1.14
2020-11
143
154
ClusterDataSplit: Exploring Challenging Clustering-Based Data Splits for Model Performance Evaluation
Hanna
Wecker
author
Annemarie
Friedrich
author
Heike
Adel
author
2020-11
text
wecker-etal-2020-clusterdatasplit
10.18653/v1/2020.eval4nlp-1.15
https://aclanthology.org/2020.eval4nlp-1.15
2020-11
155
163
Best Practices for Crowd-based Evaluation of German Summarization: Comparing Crowd, Expert and Automatic Evaluation
Neslihan
Iskender
author
Tim
Polzehl
author
Sebastian
Möller
author
2020-11
text
iskender-etal-2020-best
10.18653/v1/2020.eval4nlp-1.16
https://aclanthology.org/2020.eval4nlp-1.16
2020-11
164
175
Evaluating Word Embeddings on Low-Resource Languages
Nathan
Stringham
author
Mike
Izbicki
author
2020-11
text
stringham-izbicki-2020-evaluating
10.18653/v1/2020.eval4nlp-1.17
https://aclanthology.org/2020.eval4nlp-1.17
2020-11
176
186