Haotian Chen
2023
Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction
Haotian Chen
|
Bingsheng Chen
|
Xiangdong Zhou
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Document-level relation extraction (DocRE) attracts more research interest recently. While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied: Do they make the right predictions according to rationales? In this paper, we take the first step toward answering this question and then introduce a new perspective on comprehensively evaluating a model. Specifically, we first conduct annotations to provide the rationales considered by humans in DocRE. Then, we conduct investigations and discover the fact that: In contrast to humans, the representative state-of-the-art (SOTA) models in DocRE exhibit different reasoning processes. Through our proposed RE-specific attacks, we next demonstrate that the significant discrepancy in decision rules between models and humans severely damages the robustness of models. After that, we introduce mean average precision (MAP) to evaluate the understanding and reasoning capabilities of models. According to the extensive experimental results, we finally appeal to future work to consider evaluating the understanding ability of models because the improved ability renders models more trustworthy and robust to be deployed in real-world scenarios. We make our annotations and code publicly available.
2020
Contextualized End-to-End Neural Entity Linking
Haotian Chen
|
Xi Li
|
Andrej Zukov Gregoric
|
Sahil Wadhwa
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
We propose an entity linking (EL) model that jointly learns mention detection (MD) and entity disambiguation (ED). Our model applies task-specific heads on top of shared BERT contextualized embeddings. We achieve state-of-the-art results across a standard EL dataset using our model; we also study our model’s performance under the setting when hand-crafted entity candidate sets are not available and find that the model performs well under such a setting too.