Yu Yang

2024

Scented-EAE: Stage-Customized Entity Type Embedding for Event Argument Extraction
Yu Yang | Jinyu Guo | Kai Shuang | Chenrui Mao
Findings of the Association for Computational Linguistics: ACL 2024

Existing methods for incorporating entities into EAE rely on prompts or NER. They typically fail to explicitly explore the role of entity types, which results in shallow argument comprehension, and they often encounter three issues: (1) weak semantic associations due to missing role-entity correspondence cues; (2) compromised semantic integrity from abandoning context after recognizing entities regardless of their types; (3) one-sided semantic understanding relying solely on argument role semantics. To tackle these issues, we propose Scented-EAE, an EAE model with stage-customized entity type embedding to explicitly underscore and explore the role of entity types, thus intervening in argument selection. Specifically, at the input stage, we strengthen semantic associations by prompting role-entity correspondence after extending a non-autoregressive decoder as part of the encoder. At the intermediate stage, we preserve semantic integrity by optimizing our proposed BIO-aware NER and EAE via a novel IPE joint learning. At the output stage, we expand semantic understanding dimensions by determining arguments using span selectors from argument roles and entity types. Experiments show that our model achieves state-of-the-art performance on mainstream benchmarks. It also exhibits robustness in low-resource settings with the help of prompts and entity types.

2023

Boosting Summarization with Normalizing Flows and Aggressive Training
Yu Yang | Xiaotong Shen
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

This paper presents FlowSUM, a normalizing flows-based variational encoder-decoder framework for Transformer-based summarization. Our approach tackles two primary challenges in variational summarization: insufficient semantic information in latent representations and posterior collapse during training. To address these challenges, we employ normalizing flows to enable flexible latent posterior modeling, and we propose a controlled alternate aggressive training (CAAT) strategy with an improved gate mechanism. Experimental results show that FlowSUM significantly enhances the quality of generated summaries and unleashes the potential for knowledge distillation with minimal impact on inference time. Furthermore, we investigate the issue of posterior collapse in normalizing flows and analyze how the summary quality is affected by the training strategy, gate initialization, and the type and number of normalizing flows used, offering valuable insights for future research.

2021

We consider the problem of scaling automated suggested replies for a commercial email application to multiple languages. Faced with increased compute requirements and low language resources for language expansion, we build a single universal model for improving the quality and reducing run-time costs of our production system. However, restricted data movement across regional centers prevents joint training across languages. To this end, we propose a multi-lingual multi-task continual learning framework, with auxiliary tasks and language adapters to train a universal language representation across regions. The experimental results show positive cross-lingual transfer across languages while reducing catastrophic forgetting across regions. Our online results on real user traffic show significant CTR and Char-saved gains as well as a 65% training cost reduction compared with per-language models. As a consequence, we have scaled the feature to multiple languages, including low-resource markets.
