Tzu Hsuan Chou
2023
Advancing Multi-Criteria Chinese Word Segmentation Through Criterion Classification and Denoising
Tzu Hsuan Chou | Chun-Yi Lin | Hung-Yu Kao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent research on multi-criteria Chinese word segmentation (MCCWS) mainly focuses on building complex private structures, adding more handcrafted features, or introducing complex optimization processes. In this work, we show that a simple yet elegant input-hint-based MCCWS model can achieve state-of-the-art (SoTA) performance on several datasets simultaneously. We further propose a novel criterion-denoising objective that slightly lowers the F1 score but achieves SoTA recall on out-of-vocabulary words. Our results establish a simple yet strong baseline for future MCCWS research. Source code is available at https://github.com/IKMLab/MCCWS.
2019
Fill the GAP: Exploiting BERT for Pronoun Resolution
Kai-Chou Yang | Timothy Niven | Tzu Hsuan Chou | Hung-Yu Kao
Proceedings of the First Workshop on Gender Bias in Natural Language Processing
In this paper, we describe our entry in the gendered pronoun resolution competition, which achieved fourth place without data augmentation. Our method is an ensemble of BERT models that resolves coreference in an interaction space. We report four insights from our work: BERT’s representations involve significant redundancy; modeling interaction effects in a manner similar to natural language inference models is useful for this task; there is an optimal BERT layer from which to extract representations for pronoun resolution; and the difference between the attention weights from the pronoun to the candidate entities is highly correlated with the correct label, with interesting implications for future work.