Ante Wang


2024

Improving LLM Generations via Fine-Grained Self-Endorsement
Ante Wang | Linfeng Song | Baolin Peng | Lifeng Jin | Ye Tian | Haitao Mi | Jinsong Su | Dong Yu
Findings of the Association for Computational Linguistics: ACL 2024

This work studies mitigating fact-conflicting hallucinations for large language models (LLMs) at inference time. In particular, we propose a self-endorsement framework that leverages fine-grained fact-level comparisons across multiple sampled responses. Compared with prior ensemble methods (e.g., self-consistency) that perform response-level selection, our approach better alleviates hallucinations in knowledge-intensive tasks. It can also broadly benefit smaller and open-source LLMs, as it mainly conducts simple content-based comparisons. Experiments on Biographies show that our method effectively improves the factuality of generations with simple and intuitive prompts across LLMs of different scales. Moreover, comprehensive analyses on TriviaQA and GSM8K demonstrate the potential of self-endorsement for broader applications.
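The core mechanism is easy to prototype. Below is a minimal sketch of fine-grained self-endorsement under assumed helper callables (generate, extract_facts, supports) that stand in for LLM prompting; the prompts, threshold, and sample count are illustrative assumptions rather than the paper's exact setup.

```python
# Minimal sketch of fact-level self-endorsement (hypothetical callables, not the paper's prompts).
from typing import Callable, List

def self_endorse(question: str,
                 generate: Callable[[str], str],
                 extract_facts: Callable[[str], List[str]],
                 supports: Callable[[str, str], bool],
                 k: int = 5,
                 threshold: float = 0.5) -> str:
    # 1) Sample k candidate responses to the same question.
    responses = [generate(question) for _ in range(k)]

    # 2) Decompose each response into atomic facts.
    facts_per_response = [extract_facts(r) for r in responses]

    # 3) Keep a fact only if enough of the *other* sampled responses endorse it.
    endorsed: List[str] = []
    for i, facts in enumerate(facts_per_response):
        others = [r for j, r in enumerate(responses) if j != i]
        for fact in facts:
            votes = sum(supports(fact, other) for other in others)
            if others and votes / len(others) >= threshold:
                endorsed.append(fact)

    # 4) Regenerate the final answer conditioned only on the endorsed facts.
    fact_list = "\n".join(f"- {f}" for f in endorsed)
    return generate(f"Using only these facts:\n{fact_list}\n\nAnswer the question: {question}")
```

The contrast with response-level self-consistency is visible in step 3: agreement is counted per fact rather than per whole response.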

Self-Consistency Boosts Calibration for Math Reasoning
Ante Wang | Linfeng Song | Ye Tian | Baolin Peng | Lifeng Jin | Haitao Mi | Jinsong Su | Dong Yu
Findings of the Association for Computational Linguistics: EMNLP 2024

Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal
Jianheng Huang | Leyang Cui | Ante Wang | Chengyi Yang | Xinting Liao | Linfeng Song | Junfeng Yao | Jinsong Su
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large language models (LLMs) suffer from catastrophic forgetting during continual learning. Conventional rehearsal-based methods rely on previous training data to retain the model's ability, which may not be feasible in real-world applications: when continual learning is conducted on a publicly released LLM checkpoint, the original training data may simply be unavailable. To address this challenge, we propose a framework called Self-Synthesized Rehearsal (SSR) that uses the LLM itself to generate synthetic instances for rehearsal. Concretely, we first employ the base LLM for in-context learning to generate synthetic instances. Subsequently, we use the latest LLM to refine the instance outputs based on the synthetic inputs, preserving its acquired ability. Finally, we select diverse, high-quality synthetic instances for rehearsal in future stages. Experimental results demonstrate that SSR achieves superior or comparable performance to conventional rehearsal-based approaches while being more data-efficient. Moreover, SSR effectively preserves the generalization capabilities of LLMs in general domains.
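As a rough illustration of the three-stage pipeline described above, the sketch below assumes hypothetical callables for the base LLM's in-context generation, the latest LLM's answering, and the diversity/quality selection step; it mirrors the structure of SSR rather than any released implementation.

```python
# Rough sketch of the Self-Synthesized Rehearsal (SSR) loop; all callables are placeholders.
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input, output)

def build_rehearsal_set(base_generate: Callable[[List[Example]], Example],
                        latest_answer: Callable[[str], str],
                        select: Callable[[List[Example], int], List[Example]],
                        demos: List[Example],
                        n_synthetic: int = 1000,
                        n_keep: int = 200) -> List[Example]:
    # 1) Use the *base* LLM with in-context demonstrations to synthesize new instances.
    synthetic = [base_generate(demos) for _ in range(n_synthetic)]

    # 2) Use the *latest* LLM to refine the outputs for the synthetic inputs,
    #    so the rehearsal targets reflect the ability acquired so far.
    refined = [(x, latest_answer(x)) for (x, _) in synthetic]

    # 3) Keep a diverse, high-quality subset for rehearsal in future training stages.
    return select(refined, n_keep)
```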

EmoTrans: Emotional Transition-based Model for Emotion Recognition in Conversation
Zhongquan Jian | Ante Wang | Jinsong Su | Junfeng Yao | Meihong Wang | Qingqiang Wu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In an emotional conversation, emotions are causally transmitted among the participants. This constitutes a fundamental conversational feature: it facilitates comprehension of the intricate changes in emotional states over the course of the conversation and helps neutralize the emotional semantic bias of utterances caused by missing modality information. Emotional transition (ET) therefore plays a crucial role in Emotion Recognition in Conversation (ERC), yet it has not received sufficient attention in current research. In light of this, we propose an Emotional Transition-based Emotion Recognizer (EmoTrans). Specifically, we concatenate the most recent utterances with their corresponding speakers to construct the model input, known as a sample, each with several placeholders that implicitly express the emotions of contextual utterances. Based on these placeholders, two components are developed to make the model sensitive to emotions and to effectively capture the ET features in the sample. Furthermore, an ET-based Contrastive Learning (CL) objective is developed to compact the representation space, yielding more robust sample representations. We conducted exhaustive experiments on four widely used datasets and obtained competitive results, including new state-of-the-art results on MELD and IEMOCAP, demonstrating the superiority of EmoTrans.
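For intuition only, a toy sketch of assembling such a sample is shown below; the placeholder token, context window size, and formatting are assumptions rather than the paper's exact input format.

```python
# Illustrative construction of an ERC sample with emotion placeholders (format is assumed).
from typing import List, Tuple

def build_sample(history: List[Tuple[str, str]], window: int = 5,
                 placeholder: str = "<emo>") -> str:
    # Keep only the most recent `window` (speaker, utterance) pairs.
    recent = history[-window:]
    # Each contextual utterance gets a placeholder that implicitly carries its emotion,
    # which downstream components use to capture emotional-transition (ET) features.
    parts = [f"{spk}: {utt} {placeholder}" for spk, utt in recent]
    return " ".join(parts)

print(build_sample([("A", "I got the job!"), ("B", "That's wonderful news.")]))
```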

2023

Domain Adaptation for Conversational Query Production with the RAG Model Feedback
Ante Wang | Linfeng Song | Ge Xu | Jinsong Su
Findings of the Association for Computational Linguistics: EMNLP 2023

Conversational query production is an emerging fundamental task for dialogue systems, where search queries are generated to explore the vast and continually updating knowledge available from a search engine. To accelerate this line of research, previous studies have released several datasets with human-annotated search queries. However, the limited annotations still cannot cover conversations from various domains. To address this challenge, we propose a novel domain adaptation framework. It is inspired by a weakly supervised learning algorithm from previous work that guides a model using reinforcement learning with BM25 scores as feedback. Though effective, that algorithm is fragile when facing noisy webpage content from a commercial search engine and variance across conversations, because it ignores the deep semantic information of dialogue contexts. We therefore improve the algorithm by taking advantage of retrieval-augmented generation (RAG) and exploring several practical techniques, such as knowledge distillation, for stable training. We conduct experiments in multiple settings across different languages. Guided by RAG model feedback, our model is more robust and performs significantly better than strong baselines, especially in a more challenging setting.
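Schematically, the feedback loop can be pictured as below (a REINFORCE-style sketch; all callables are hypothetical placeholders, and the actual method additionally uses techniques such as knowledge distillation for stable training).

```python
# Schematic sketch of training a query producer with RAG-model feedback (placeholders only).
from typing import Callable, List

def rag_feedback_step(contexts: List[str],
                      gold_responses: List[str],
                      sample_query: Callable[[str], str],
                      search: Callable[[str], List[str]],
                      rag_score: Callable[[str, List[str], str], float],
                      update: Callable[[str, str, float], None]) -> None:
    for context, gold in zip(contexts, gold_responses):
        query = sample_query(context)            # sample a search query from the generator
        docs = search(query)                     # retrieve webpages with the query
        reward = rag_score(context, docs, gold)  # how well a RAG model recovers the gold response
        update(context, query, reward)           # policy-gradient update scaled by the reward
```

The key difference from BM25 feedback is that the reward comes from a RAG model that reads the retrieved pages in the light of the dialogue context, rather than from surface-level lexical overlap.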

OpenFact: Factuality Enhanced Open Knowledge Extraction
Linfeng Song | Ante Wang | Xiaoman Pan | Hongming Zhang | Dian Yu | Lifeng Jin | Haitao Mi | Jinsong Su | Yue Zhang | Dong Yu
Transactions of the Association for Computational Linguistics, Volume 11

We focus on the factuality property during the extraction of an OpenIE corpus named OpenFact, which contains more than 12 million high-quality knowledge triplets. We break down the factuality property into two important aspects, expressiveness and groundedness, and we propose a comprehensive framework to handle both. To enhance expressiveness, we formulate each knowledge piece in OpenFact based on a semantic frame. We also design templates, add extra constraints, and incorporate human effort so that most OpenFact triplets contain enough details. For groundedness, we require the main arguments of each triplet to contain linked Wikidata entities. A human evaluation suggests that the OpenFact triplets are much more accurate and contain denser information compared to OPIEC-Linked (Gashteovski et al., 2019), a recent high-quality OpenIE corpus grounded to Wikidata. Further experiments on knowledge base completion and knowledge base question answering show the effectiveness of OpenFact over OPIEC-Linked as supplementary knowledge to Wikidata as the major KG.
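As an illustration of the groundedness requirement, a simple filter might look like the following; the data structure and field names are assumptions, and entity linking itself is taken as given rather than shown.

```python
# Illustrative groundedness filter: keep only triplets whose main arguments link to Wikidata.
from dataclasses import dataclass
from typing import Optional, List

@dataclass
class Triplet:
    subject: str
    relation: str
    obj: str
    subject_qid: Optional[str] = None  # linked Wikidata ID, e.g. "Q76"
    obj_qid: Optional[str] = None

def grounded(triplets: List[Triplet]) -> List[Triplet]:
    # A triplet is grounded when both main arguments carry Wikidata links.
    return [t for t in triplets if t.subject_qid and t.obj_qid]
```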

2021

BACO: A Background Knowledge- and Content-Based Framework for Citing Sentence Generation
Yubin Ge | Ly Dinh | Xiaofeng Liu | Jinsong Su | Ziyao Lu | Ante Wang | Jana Diesner
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we focus on the problem of citing sentence generation, which entails generating a short text to capture the salient information in a cited paper and the connection between the citing and cited papers. We present BACO, a BAckground knowledge- and COntent-based framework for citing sentence generation, which considers two types of information: (1) background knowledge, leveraged as structural information from a citation network; and (2) content, which represents in-depth information about what to cite and why to cite. First, a citation network is encoded to provide background knowledge. Second, we apply salience estimation to identify what to cite by estimating the importance of sentences in the cited paper. During the decoding stage, both types of information are combined to facilitate text generation, and the generator is jointly trained with a citation function classifier to make the model aware of why to cite. Our experimental results show that our framework outperforms comparative baselines.
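A simplified view of the joint objective is sketched below; the loss weighting, tensor shapes, and variable names are assumptions rather than the paper's exact formulation.

```python
# Simplified sketch of joint training: citing-sentence generation plus citation-function classification.
import torch
import torch.nn.functional as F

def joint_loss(gen_logits: torch.Tensor,    # (batch, seq_len, vocab)
               gen_targets: torch.Tensor,   # (batch, seq_len)
               func_logits: torch.Tensor,   # (batch, num_citation_functions)
               func_labels: torch.Tensor,   # (batch,)
               alpha: float = 0.5) -> torch.Tensor:
    # Token-level cross-entropy for generating the citing sentence.
    gen_loss = F.cross_entropy(gen_logits.transpose(1, 2), gen_targets, ignore_index=-100)
    # Classification loss that makes the model aware of *why* to cite.
    func_loss = F.cross_entropy(func_logits, func_labels)
    return gen_loss + alpha * func_loss
```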

Improving Graph-based Sentence Ordering with Iteratively Predicted Pairwise Orderings
Shaopeng Lai | Ante Wang | Fandong Meng | Jie Zhou | Yubin Ge | Jiali Zeng | Junfeng Yao | Degen Huang | Jinsong Su
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Dominant sentence ordering models can be classified into pairwise ordering models and set-to-sequence models. However, there has been little attempt to combine these two types of models, which intuitively possess complementary advantages. In this paper, we propose a novel sentence ordering framework that introduces two classifiers to make better use of pairwise orderings for graph-based sentence ordering (Yin et al. 2019, 2021). Specifically, given an initial sentence-entity graph, we first introduce a graph-based classifier to predict pairwise orderings between linked sentences. Then, in an iterative manner, another classifier predicts the remaining uncertain pairwise orderings based on the graph updated with previously predicted high-confidence pairwise orderings. Finally, we adapt a GRN-based sentence ordering model (Yin et al. 2019, 2021) on the basis of the final graph. Experiments on five commonly used datasets demonstrate the effectiveness and generality of our model. In particular, when equipped with BERT (Devlin et al. 2019) and FHDecoder (Yin et al. 2020), our model achieves state-of-the-art performance. Our code is available at https://github.com/DeepLearnXMU/IRSEG.
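The iterative part of the procedure can be summarized as follows; the confidence threshold, round limit, and callable signatures are illustrative assumptions, not the released code.

```python
# High-level sketch of iteratively predicted pairwise orderings: commit only confident
# predictions, update the graph, and re-predict the remaining pairs.
from typing import Callable, Dict, Set, Tuple

Pair = Tuple[int, int]

def iterative_pairwise(pairs: Set[Pair],
                       predict: Callable[[Pair, Dict[Pair, int]], Tuple[int, float]],
                       threshold: float = 0.9,
                       max_rounds: int = 5) -> Dict[Pair, int]:
    decided: Dict[Pair, int] = {}  # pair -> predicted order (0: i before j, 1: j before i)
    remaining = set(pairs)
    for _ in range(max_rounds):
        newly_decided: Dict[Pair, int] = {}
        for pair in remaining:
            label, confidence = predict(pair, decided)  # conditioned on the updated graph
            if confidence >= threshold:
                newly_decided[pair] = label
        if not newly_decided:
            break
        decided.update(newly_decided)
        remaining -= set(newly_decided)
    # Any still-uncertain pairs are decided in a final pass, regardless of confidence.
    for pair in remaining:
        label, _ = predict(pair, decided)
        decided[pair] = label
    return decided
```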

2020

Structural Information Preserving for Graph-to-Text Generation
Linfeng Song | Ante Wang | Jinsong Su | Yue Zhang | Kun Xu | Yubin Ge | Dong Yu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs. A crucial defect of current state-of-the-art models is that they may scramble or even drop the core structural information of input graphs when generating outputs. We propose to tackle this problem by leveraging richer training signals that can guide our model in preserving input information. In particular, we introduce two types of autoencoding losses, each individually focusing on a different aspect (a.k.a. view) of the input graphs. The losses are then back-propagated to better calibrate our model via multi-task training. Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
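Conceptually, the training signal combines the usual generation loss with the two view-specific autoencoding losses; the sketch below uses assumed names and weights for the two views.

```python
# Schematic multi-task objective: generation loss plus two autoencoding (reconstruction) losses
# over different views of the input graph; the view names and weights are assumptions.
import torch

def total_loss(generation_loss: torch.Tensor,
               view1_reconstruction_loss: torch.Tensor,
               view2_reconstruction_loss: torch.Tensor,
               lambda1: float = 0.1,
               lambda2: float = 0.1) -> torch.Tensor:
    # The autoencoding terms push the encoder to retain the graph's core structure,
    # and their gradients calibrate the shared parameters via multi-task training.
    return generation_loss + lambda1 * view1_reconstruction_loss + lambda2 * view2_reconstruction_loss
```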