Hongxiang Li


2024

Chem-FINESE: Validating Fine-Grained Few-shot Entity Extraction through Text Reconstruction
Qingyun Wang | Zixuan Zhang | Hongxiang Li | Xuan Liu | Jiawei Han | Huimin Zhao | Heng Ji
Findings of the Association for Computational Linguistics: EACL 2024

Fine-grained few-shot entity extraction in the chemical domain faces two unique challenges. First, compared with entity extraction tasks in the general domain, sentences from chemical papers usually contain more entities. Moreover, entity extraction models usually have difficulty extracting entities of long-tailed types. In this paper, we propose Chem-FINESE, a novel sequence-to-sequence (seq2seq) based few-shot entity extraction approach, to address these two challenges. Our Chem-FINESE has two components: a seq2seq entity extractor to extract named entities from the input sentence and a seq2seq self-validation module to reconstruct the original input sentence from extracted entities. Inspired by the fact that a good entity extraction system needs to extract entities faithfully, our new self-validation module leverages entity extraction results to reconstruct the original input sentence. In addition, we design a new contrastive loss to reduce excessive copying during the extraction process. Finally, we release ChemNER+, a new fine-grained chemical entity extraction dataset that is annotated by domain experts with the ChemNER schema. Experiments in few-shot settings with both ChemNER+ and CHEMET datasets show that our newly proposed framework yields absolute F1-score gains of up to 8.26% and 6.84%, respectively.
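
The abstract reports gains in absolute F1-score, that is, differences in percentage points rather than relative improvements. As a minimal sketch of what that means, the snippet below computes an exact-match F1 over (mention, type) pairs and subtracts two scores. The function name, the example mentions, and the type labels are hypothetical illustrations, and the paper's actual evaluation protocol may differ.

```python
def entity_f1(gold, predicted):
    """Exact-match F1 over sets of (mention, type) pairs (an assumed metric)."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations; the type names are illustrative only.
gold = [("palladium", "Catalyst"), ("Suzuki coupling", "Reaction"), ("THF", "Solvent")]
baseline = [("palladium", "Catalyst"), ("THF", "Reagent")]   # one wrong type
improved = [("palladium", "Catalyst"), ("THF", "Solvent")]   # both correct

# Absolute gain: a plain difference of the two F1 scores.
absolute_gain = entity_f1(gold, improved) - entity_f1(gold, baseline)
print(f"absolute F1 gain: {absolute_gain * 100:.2f} points")
```

A relative gain would instead divide this difference by the baseline score, which is why the two kinds of figures are not interchangeable.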

2023

Reaction Miner: An Integrated System for Chemical Reaction Extraction from Textual Data
Ming Zhong | Siru Ouyang | Yizhu Jiao | Priyanka Kargupta | Leo Luo | Yanzhen Shen | Bobby Zhou | Xianrui Zhong | Xuan Liu | Hongxiang Li | Jinfeng Xiao | Minhao Jiang | Vivian Hu | Xuan Wang | Heng Ji | Martin Burke | Huimin Zhao | Jiawei Han
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Chemical reactions, as a core entity in the realm of chemistry, hold crucial implications in diverse areas ranging from hands-on laboratory research to advanced computational drug design. Despite a burgeoning interest in employing NLP techniques to extract these reactions, aligning this task with the real-world requirements of chemistry practitioners remains an ongoing challenge. In this paper, we present Reaction Miner, a system specifically designed to interact with raw scientific literature, delivering precise and more informative chemical reactions. Going beyond mere extraction, Reaction Miner integrates a holistic workflow: it accepts PDF files as input, bypassing the need for pre-processing and bolstering user accessibility. Subsequently, a text segmentation module ensures that the refined text encapsulates complete chemical reactions, augmenting the accuracy of extraction. Moreover, Reaction Miner broadens the scope of existing pre-defined reaction roles, including vital attributes previously neglected, thereby offering a more comprehensive depiction of chemical reactions. Evaluations conducted by chemistry domain users highlight the efficacy of each module in our system, demonstrating Reaction Miner as a powerful tool in this field.

ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
Xuxin Cheng | Bowen Cao | Qichen Ye | Zhihong Zhu | Hongxiang Li | Yuexian Zou
Findings of the Association for Computational Linguistics: ACL 2023

Spoken language understanding (SLU) is a fundamental task in task-oriented dialogue systems. However, the inevitable errors from automatic speech recognition (ASR) usually impair the understanding performance and lead to error propagation. Although there are some attempts to address this problem through contrastive learning, they (1) treat clean manual transcripts and ASR transcripts equally without discrimination in fine-tuning; (2) neglect the fact that semantically similar pairs are still pushed away when applying contrastive learning; (3) suffer from the problem of Kullback–Leibler (KL) vanishing. In this paper, we propose Mutual Learning and Large-Margin Contrastive Learning (ML-LMCL), a novel framework for improving ASR robustness in SLU. Specifically, in fine-tuning, we apply mutual learning and train two SLU models on the manual transcripts and the ASR transcripts, respectively, aiming to iteratively share knowledge between these two models. We also introduce a distance polarization regularizer to avoid pushing away the intra-cluster pairs as much as possible. Moreover, we use a cyclical annealing schedule to mitigate the KL vanishing issue. Experiments on three datasets show that ML-LMCL outperforms existing models and achieves new state-of-the-art performance.
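
The cyclical annealing schedule mentioned in the abstract can be illustrated with a generic version: the weight on the KL term ramps linearly from 0 to its maximum during the first part of each cycle, holds there, and then resets. This is a sketch of the general technique only. The function name and the parameters `num_cycles`, `ramp_fraction`, and `max_weight` are assumptions, not values taken from the paper.

```python
def cyclical_kl_weight(step, total_steps, num_cycles=4, ramp_fraction=0.5, max_weight=1.0):
    """Return the KL weight at a training step under a cyclical schedule.

    All hyperparameters here are illustrative assumptions.
    """
    cycle_length = total_steps / num_cycles
    # Position within the current cycle, in [0, 1).
    position = (step % cycle_length) / cycle_length
    if position < ramp_fraction:
        # Linear ramp from 0 up to max_weight.
        return max_weight * position / ramp_fraction
    # Hold at the maximum for the rest of the cycle.
    return max_weight


# Print the weight at a few steps of a hypothetical 100-step run.
for step in (0, 5, 12, 20, 25, 30):
    print(step, round(cyclical_kl_weight(step, total_steps=100), 3))
```

Restarting the ramp several times gives the model repeated low-weight phases, which is the usual motivation for preferring this schedule over a single monotonic warm-up.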

Accelerating Multiple Intent Detection and Slot Filling via Targeted Knowledge Distillation
Xuxin Cheng | Zhihong Zhu | Wanshi Xu | Yaowei Li | Hongxiang Li | Yuexian Zou
Findings of the Association for Computational Linguistics: EMNLP 2023

Recent non-autoregressive Spoken Language Understanding (SLU) models have attracted increasing attention because of their encouraging inference speed. However, most existing methods (1) suffer from the multi-modality problem since they have little prior knowledge about the reference during inference; (2) fail to achieve a satisfactory inference speed because of their complex frameworks. To tackle these issues, in this paper, we propose a Targeted Knowledge Distillation Framework (TKDF) for multi-intent SLU, which utilizes the knowledge distillation method to improve the performance. Specifically, we first train an SLU model as the teacher model, which has higher accuracy but slower inference speed. Then we introduce an evaluator and apply a curriculum learning strategy to select proper targets for the student model. Experimental results on two public multi-intent datasets show that our approach can realize a flexible trade-off between inference speed and accuracy, achieving comparable performance to the state-of-the-art models while speeding up by over 4.5 times. More encouragingly, further analysis shows that distilling only 4% of the original data can help the student model outperform its counterpart trained on the original data by about 14.6% in terms of overall accuracy on the MixATIS dataset.
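
For readers unfamiliar with knowledge distillation, the teacher-student idea can be sketched with the standard soft-label loss: the student is trained to match the teacher's temperature-softened distribution as well as the gold label. This is the generic formulation, not TKDF's evaluator or curriculum-based target selection, and the function names, `temperature`, and `alpha` are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss: soft KL term plus hard cross-entropy.

    temperature and alpha are illustrative assumptions.
    """
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(t * math.log(t / s) for t, s in zip(teacher_soft, student_soft) if t > 0)
    # Cross-entropy of the student against the gold label.
    hard = -math.log(softmax(student_logits)[gold_index])
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


teacher = [2.0, 0.5, -1.0]
agreeing_student = [2.0, 0.5, -1.0]
disagreeing_student = [-1.0, 0.5, 2.0]
print(distillation_loss(agreeing_student, teacher, gold_index=0))
print(distillation_loss(disagreeing_student, teacher, gold_index=0))
```

The student that agrees with the teacher incurs only the cross-entropy term, while the disagreeing student is penalized by both terms.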