Shengqi Zhu


2024

“Get Their Hands Dirty, Not Mine”: On Researcher-Annotator Collaboration and the Agency of Annotators
Shengqi Zhu | Jeffrey Rzeszotarski
Findings of the Association for Computational Linguistics: ACL 2024

Annotation quality is often framed as post-hoc cleanup of annotator-caused issues. This position paper discusses whether, how, and why this narrative limits the scope of improving annotation. We call to consider annotation as a procedural collaboration, outlining three points in this direction: (1) An issue can be either annotator- or researcher-oriented, where one party is accountable and the other party may lack ability to fix it; (2) yet, they can co-occur or have similar consequences, and thus any specific problem we encounter may be a combination; (3) therefore, we need a new language to capture the nuance and holistically describe the full procedure to resolve these issues. To that end, we propose to study how agency is manifested in annotation and picture how this perspective benefits the community more broadly.

2023

More than Classification: A Unified Framework for Event Temporal Relation Extraction
Quzhe Huang | Yutong Hu | Shengqi Zhu | Yansong Feng | Chang Liu | Dongyan Zhao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Event temporal relation extraction (ETRE) is usually formulated as a multi-label classification task, where each type of relation is simply treated as a one-hot label. This formulation ignores the meaning of relations and wipes out their intrinsic dependency. After examining the relation definitions in various ETRE tasks, we observe that all relations can be interpreted using the start and end time points of events. For example, relation Includes could be interpreted as event 1 starting no later than event 2 and ending no earlier than event 2. In this paper, we propose a unified event temporal relation extraction framework, which transforms temporal relations into logical expressions of time points and completes the ETRE by predicting the relations between certain time point pairs. Experiments on TB-Dense and MATRES show significant improvements over a strong baseline and outperform the state-of-the-art model by 0.3% on both datasets. By representing all relations in a unified framework, we can leverage the relations with sufficient data to assist the learning of other relations, thus achieving stable improvement in low-data scenarios. When the relation definitions are changed, our method can quickly adapt to the new ones by simply modifying the logic expressions that map time points to new event relations. The code is released at https://github.com/AndrewZhe/A-Unified-Framework-for-ETRE
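
The time-point view of relations described in this abstract can be illustrated with a minimal sketch. It is not the authors' released code: the `Event` class, the `RELATIONS` table, and the `holds` helper are hypothetical names, the `Before` definition is an assumption added for contrast, and the events carry known numeric time points, whereas the paper predicts relations between time point pairs with a model. Only the `Includes` rule follows the wording of the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Event:
    """Hypothetical event with explicit start and end time points."""
    start: float
    end: float


# Relation name -> logical expression over the time points of two events.
RELATIONS: Dict[str, Callable[[Event, Event], bool]] = {
    # From the abstract: event 1 starts no later than event 2
    # and ends no earlier than event 2.
    "Includes": lambda e1, e2: e1.start <= e2.start and e1.end >= e2.end,
    # Assumed definition, for illustration only (not stated in the abstract).
    "Before": lambda e1, e2: e1.end < e2.start,
}


def holds(relation: str, e1: Event, e2: Event) -> bool:
    """Evaluate the logical expression registered for `relation`."""
    return RELATIONS[relation](e1, e2)


meeting = Event(start=9.0, end=12.0)
talk = Event(start=10.0, end=11.0)
lunch = Event(start=12.5, end=13.5)

print(holds("Includes", meeting, talk))
print(holds("Before", meeting, lunch))
```

In this sketch, adapting to a changed relation definition amounts to editing one entry in `RELATIONS`, which mirrors the abstract's point about modifying the logic expressions rather than relearning a label set.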

2022

Does Recommend-Revise Produce Reliable Annotations? An Analysis on Missing Instances in DocRED
Quzhe Huang | Shibo Hao | Yuan Ye | Shengqi Zhu | Yansong Feng | Dongyan Zhao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

DocRED is a widely used dataset for document-level relation extraction. In the large-scale annotation, a recommend-revise scheme is adopted to reduce the workload. Within this scheme, annotators are provided with candidate relation instances from distant supervision, and they then manually supplement and remove relational facts based on the recommendations. However, when comparing DocRED with a subset relabeled from scratch, we find that this scheme results in a considerable amount of false negative samples and an obvious bias towards popular entities and relations. Furthermore, we observe that the models trained on DocRED have low recall on our relabeled dataset and inherit the same bias in the training data. Through the analysis of annotators’ behaviors, we figure out the underlying reason for the problems above: the scheme actually discourages annotators from supplementing adequate instances in the revision phase. We appeal to future research to take into consideration the issues with the recommend-revise scheme when designing new models and annotation schemes. The relabeled dataset is released at https://github.com/AndrewZhe/Revisit-DocRED, to serve as a more reliable test set of document RE models.

2021

Exploring Distantly-Labeled Rationales in Neural Network Models
Quzhe Huang | Shengqi Zhu | Yansong Feng | Dongyan Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recent studies strive to incorporate various human rationales into neural networks to improve model performance, but few pay attention to the quality of the rationales. Most existing methods distribute their models’ focus to distantly-labeled rationale words entirely and equally, while ignoring potentially important non-rationale words and not distinguishing the importance of different rationale words. In this paper, we propose two novel auxiliary loss functions to make better use of distantly-labeled rationales, which encourage models to maintain their focus on important words beyond labeled rationales (PINs) and alleviate redundant training on non-helpful rationales (NoIRs). Experiments on two representative classification tasks show that our proposed methods can push a classification model to effectively learn crucial clues from non-perfect rationales while maintaining the ability to spread its focus to other unlabeled important words, thus significantly outperforming existing methods.

Three Sentences Are All You Need: Local Path Enhanced Document Relation Extraction
Quzhe Huang | Shengqi Zhu | Yansong Feng | Yuan Ye | Yuxuan Lai | Dongyan Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Document-level Relation Extraction (RE) is a more challenging task than sentence RE as it often requires reasoning over multiple sentences. Yet, human annotators usually use a small number of sentences to identify the relationship between a given entity pair. In this paper, we present an embarrassingly simple but effective method to heuristically select evidence sentences for document-level RE, which can be easily combined with BiLSTM to achieve good performance on benchmark datasets, even better than fancy graph neural network based methods. We have released our code at https://github.com/AndrewZhe/Three-Sentences-Are-All-You-Need.