Benjamin Yao


KEPLET: Knowledge-Enhanced Pretrained Language Model with Topic Entity Awareness
Yichuan Li | Jialong Han | Kyumin Lee | Chengyuan Ma | Benjamin Yao | Xiaohu Liu
Findings of the Association for Computational Linguistics: EMNLP 2023

In recent years, Pre-trained Language Models (PLMs) have shown their superiority by pre-training on unstructured text corpora and then fine-tuning on downstream tasks. On entity-rich textual resources like Wikipedia, Knowledge-Enhanced PLMs (KEPLMs) incorporate the interactions between tokens and mentioned entities in pre-training, and are thus more effective on entity-centric tasks such as entity linking and relation classification. Although they exploit Wikipedia’s rich structures to some extent, conventional KEPLMs still neglect a unique layout of the corpus, where each Wikipedia page is centered on a topic entity (identified by the page URL and shown in the page title). In this paper, we demonstrate that KEPLMs that do not incorporate topic entities suffer from insufficient entity interaction and biased (relation) word semantics. We thus propose KEPLET, a novel Knowledge-Enhanced Pre-trained LanguagE model with Topic entity awareness. In an end-to-end manner, KEPLET identifies where to add the topic entity’s information in a Wikipedia sentence, fuses such information into token and mentioned-entity representations, and supervises the network learning, through which it takes topic entities back into consideration. Experiments demonstrate the generality and superiority of KEPLET, which was applied to two representative KEPLMs and achieved significant improvements on four entity-centric tasks.

UseClean: learning from complex noisy labels in named entity recognition
Jinjin Tian | Kun Zhou | Meiguo Wang | Yu Zhang | Benjamin Yao | Xiaohu Liu | Chenlei Guo
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)

We investigate and refine denoising methods for the NER task on data that potentially contains extremely noisy labels from multiple sources. In this paper, we first summarize all possible noise types and noise generation schemes, based on which we build a thorough evaluation system. We then pinpoint the bottleneck of current state-of-the-art denoising methods using our evaluation system. Correspondingly, we propose several refinements: a two-stage framework to avoid error accumulation; a novel confidence score that uses minimal clean supervision to increase predictive power; automatic cutoff fitting to save extensive hyper-parameter tuning; and a warm-started weighted partial CRF to better learn on the noisy tokens. Additionally, we propose to use adaptive sampling to further boost performance in long-tailed entity settings. Our method improves the F1 score by on average at least 5 to 10% over the current state of the art across extensive experiments.

PersonaPKT: Building Personalized Dialogue Agents via Parameter-efficient Knowledge Transfer
Xu Han | Bin Guo | Yoon Jung | Benjamin Yao | Yu Zhang | Xiaohu Liu | Chenlei Guo
Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)


Joint Goal Segmentation and Goal Success Prediction on Multi-Domain Conversations
Meiguo Wang | Benjamin Yao | Bin Guo | Xiaohu Liu | Yu Zhang | Tuan-Hung Pham | Chenlei Guo
Proceedings of the 29th International Conference on Computational Linguistics

To evaluate the performance of a multi-domain goal-oriented Dialogue System (DS), it is important to understand what the users’ goals are for the conversations and whether those goals are successfully achieved. The success rate of goals directly correlates with user satisfaction and perceived usefulness of the DS. In this paper, we propose a novel automatic dialogue evaluation framework that jointly performs two tasks: goal segmentation and goal success prediction. We extend the RoBERTa-IQ model (Gupta et al., 2021) by adding multi-task learning heads for goal segmentation and success prediction. Using an annotated dataset from a commercial DS, we demonstrate that our proposed model reaches an accuracy that is on par with single-pass human annotation when compared to a three-pass gold annotation benchmark.