Wei He


2024

LONGAGENT: Achieving Question Answering for 128k-Token-Long Documents through Multi-Agent Collaboration
Jun Zhao | Can Zu | Xu Hao | Yi Lu | Wei He | Yiwen Ding | Tao Gui | Qi Zhang | Xuanjing Huang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) have achieved tremendous success in understanding language and processing text. However, question-answering (QA) on lengthy documents faces challenges of resource constraints and a high propensity for errors, even for the most advanced models such as GPT-4 and Claude2. In this paper, we introduce _LongAgent_, a multi-agent collaboration method that enables efficient and effective QA over 128k-token-long documents. _LongAgent_ adopts a _divide-and-conquer_ strategy, breaking down lengthy documents into shorter, more manageable text chunks. A leader agent comprehends the user’s query and organizes the member agents to read their assigned chunks, reasoning a final answer through multiple rounds of discussion. Due to members’ hallucinations, it’s difficult to guarantee that every response provided by each member is accurate. To address this, we develop an _inter-member communication_ mechanism that facilitates information sharing, allowing for the detection and mitigation of hallucinatory responses. Experimental results show that a LLaMA-2 7B driven by _LongAgent_ can effectively support QA over 128k-token documents, achieving 16.42% and 1.63% accuracy gains over GPT-4 on single-hop and multi-hop QA settings, respectively.

Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models
Wei He | Shichun Liu | Jun Zhao | Yiwen Ding | Yi Lu | Zhiheng Xi | Tao Gui | Qi Zhang | Xuanjing Huang
Findings of the Association for Computational Linguistics: NAACL 2024

Large language models (LLMs) have shown promising abilities of in-context learning (ICL), adapting swiftly to new tasks with only few-shot demonstrations. However, current few-shot methods heavily depend on high-quality, query-specific demos, which are often lacking. When faced with out-of-demonstration (OOD) queries, methods that rely on hand-crafted demos or external retrievers might fail. To bridge the gap between limited demos and OOD queries, we propose Self-Demos, a novel prompting method that elicits the inherent generalizability in LLMs by query-aware demo generation. The generated demos strategically interpolate between existing demos and the given query, transforming the query from OOD to ID. To evaluate the effectiveness of our approach, we manually constructed OOD-Toolset, a dataset in the tool-using scenario with over 300 real-world APIs and 1000 instances, each consisting of three tool-use cases as demos and an OOD query. Thorough experiments on our dataset and two public math benchmarks have shown that our method can outperform state-of-the-art baselines in the OOD setting. Moreover, we conduct a range of analyses to validate Self-Demos’s generalization and provide more insights.

Enhancing Idiomatic Representation in Multiple Languages via an Adaptive Contrastive Triplet Loss
Wei He | Marco Idiart | Carolina Scarton | Aline Villavicencio
Findings of the Association for Computational Linguistics: ACL 2024

Accurately modeling idiomatic or non-compositional language has been a longstanding challenge in Natural Language Processing (NLP). This is partly because these expressions do not derive their meanings solely from their constituent words, but also due to the scarcity of relevant data resources, and their impact on the performance of downstream tasks such as machine translation and simplification. In this paper we propose an approach to model idiomaticity effectively using a triplet loss that incorporates the asymmetric contribution of component words to an idiomatic meaning for training language models by using adaptive contrastive learning and resampling miners to build an idiomatic-aware learning objective. Our proposed method is evaluated on a SemEval challenge and outperforms previous alternatives significantly in many metrics.

LongHeads: Multi-Head Attention is Secretly a Long Context Processor
Yi Lu | Xin Zhou | Wei He | Jun Zhao | Tao Ji | Tao Gui | Qi Zhang | Xuanjing Huang
Findings of the Association for Computational Linguistics: EMNLP 2024

Large language models (LLMs) have achieved impressive performance in numerous domains but often struggle to process lengthy inputs effectively and efficiently due to limited length generalization and attention’s quadratic computational demands. Many sought to mitigate this by restricting the attention window within the pre-trained length. However, these methods introduce new issues such as ignoring the middle context and requiring additional training. To address these problems, we propose LongHeads, a training-free framework that enhances LLM’s long context ability by unlocking multi-head attention’s untapped potential. Instead of allowing each head to attend to the full sentence, which struggles with generalizing to longer sequences due to out-of-distribution (OOD) issues, we allow each head to process in-distribution length by selecting and attending to important context chunks. To this end, we propose a chunk selection strategy that relies on the inherent correlation between the query and the key representations, efficiently distributing context chunks to different heads. In this way, each head ensures it can effectively process attended tokens within the trained length, while different heads in different layers can collectively process longer contexts. LongHeads works efficiently and fits seamlessly with many LLMs that use relative positional encoding. LongHeads achieves 100% accuracy at the 128k length on passkey retrieval task, verifying LongHeads’ efficacy in extending the usable context window for existing models.

Trustworthiness and Self-awareness in Large Language Models: An Exploration through the Think-Solve-Verify Framework
Zhendong Liu | Changhong Xia | Wei He | Chongjun Wang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

As Large Language Models (LLMs) become increasingly influential in reasoning tasks, ensuring their trustworthiness and introspective self-awareness is critical. This research introduces the Think-Solve-Verify (TSV) framework, an innovative strategy tailored to explore LLMs’ trustworthiness, introspective self-awareness, and collaborative reasoning. This method accentuates a model’s capability to construct introspective reasoning processes from answers and ensure their trustworthiness. The reasoning with TSV consistently performs at or near the top across the majority of datasets with a single interaction with LLM. Moreover, we refine the voting process of self-consistency within the Chain-of-Thought (CoT) approach, leading to notable accuracy enhancements. In our evaluations, this approach improved performance from 67.3% to 72.8% on the AQuA dataset. Furthermore, we delve into the model’s ability to explain the given answers, highlighting the significance of discerning genuine comprehension from mere guesswork.

2018

Answer-focused and Position-aware Neural Question Generation
Xingwu Sun | Jing Liu | Yajuan Lyu | Wei He | Yanjun Ma | Shi Wang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this paper, we focus on the problem of question generation (QG). Recent neural network-based approaches employ the sequence-to-sequence model which takes an answer and its context as input and generates a relevant question as output. However, we observe two major issues with these approaches: (1) The generated interrogative words (or question words) do not match the answer type. (2) The model copies the context words that are far from and irrelevant to the answer, instead of the words that are close and relevant to the answer. To address these two issues, we propose an answer-focused and position-aware neural question generation model. (1) By answer-focused, we mean that we explicitly model question word generation by incorporating the answer embedding, which can help generate an interrogative word matching the answer type. (2) By position-aware, we mean that we model the relative distance between the context words and the answer. Hence the model can be aware of the position of the context words when copying them to generate a question. We conduct extensive experiments to examine the effectiveness of our model. The experimental results show that our model significantly improves the baseline and outperforms the state-of-the-art system.

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
Yizhong Wang | Kai Liu | Jing Liu | Wei He | Yajuan Lyu | Hua Wu | Sujian Li | Haifeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by search engine. Compared with MRC on a single passage, multi-passage MRC is more challenging, since we are likely to get multiple confusing answer candidates from different passages. To address this problem, we propose an end-to-end neural model that enables those answer candidates from different passages to verify each other based on their content representations. Specifically, we jointly train three modules that can predict the final answer based on three factors: the answer boundary, the answer content and the cross-passage answer verification. The experimental results show that our method outperforms the baseline by a large margin and achieves the state-of-the-art performance on the English MS-MARCO dataset and the Chinese DuReader dataset, both of which are designed for MRC in real-world settings.

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications
Wei He | Kai Liu | Jing Liu | Yajuan Lyu | Shiqi Zhao | Xinyan Xiao | Yuan Liu | Yizhong Wang | Hua Wu | Qiaoqiao She | Xuan Liu | Tian Wu | Haifeng Wang
Proceedings of the Workshop on Machine Reading for Question Answering

This paper introduces DuReader, a new large-scale, open-domain Chinese machine reading comprehension (MRC) dataset, designed to address real-world MRC. DuReader has three advantages over previous MRC datasets: (1) data sources: questions and documents are based on Baidu Search and Baidu Zhidao; answers are manually generated. (2) question types: it provides rich annotations for more question types, especially yes-no and opinion questions, that leaves more opportunity for the research community. (3) scale: it contains 200K questions, 420K answers and 1M documents; it is the largest Chinese MRC dataset so far. Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements. To help the community make these improvements, both DuReader and baseline systems have been posted online. We also organize a shared competition to encourage the exploration of more models. Since the release of the task, there are significant improvements over the baselines.

2016

Chinese Poetry Generation with Planning based Neural Network
Zhe Wang | Wei He | Hua Wu | Haiyang Wu | Wei Li | Haifeng Wang | Enhong Chen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Chinese poetry generation is a very challenging task in natural language processing. In this paper, we propose a novel two-stage poetry generating method which first plans the sub-topics of the poem according to the user’s writing intent, and then generates each line of the poem sequentially, using a modified recurrent neural network encoder-decoder framework. The proposed planning-based method can ensure that the generated poem is coherent and semantically consistent with the user’s intent. A comprehensive evaluation with human judgments demonstrates that our proposed approach outperforms the state-of-the-art poetry generating methods and the poem quality is somewhat comparable to that of human poets.

Minimum Risk Training for Neural Machine Translation
Shiqi Shen | Yong Cheng | Zhongjun He | Wei He | Hua Wu | Maosong Sun | Yang Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Semi-Supervised Learning for Neural Machine Translation
Yong Cheng | Wei Xu | Zhongjun He | Wei He | Hua Wu | Maosong Sun | Yang Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Multi-Task Learning for Multiple Language Translation
Daxiang Dong | Hua Wu | Wei He | Dianhai Yu | Haifeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Rule-Based Weibo Messages Sentiment Polarity Classification towards Given Topics
Hongzhao Zhou | Yonglin Teng | Min Hou | Wei He | Hongtao Zhu | Xiaolin Zhu | Yanfei Mu
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

2014

Improve Statistical Machine Translation with Context-Sensitive Bilingual Semantic Embedding Model
Haiyang Wu | Daxiang Dong | Xiaoguang Hu | Dianhai Yu | Wei He | Hua Wu | Haifeng Wang | Ting Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2012

Improve SMT Quality with Automatically Extracted Paraphrase Rules
Wei He | Hua Wu | Haifeng Wang | Ting Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

Enriching SMT Training Data via Paraphrasing
Wei He | Shiqi Zhao | Haifeng Wang | Ting Liu
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

HIT-CIR: An Unsupervised WSD System Based on Domain Most Frequent Sense Estimation
Yuhang Guo | Wanxiang Che | Wei He | Ting Liu | Sheng Li
Proceedings of the 5th International Workshop on Semantic Evaluation

CMDMC: A Diachronic Digital Museum of Chinese Mandarin
Min Hou | Yu Zou | Yonglin Teng | Wei He | Yan Wang | Jun Liu | Jiyuan Wu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2009

Dependency Based Chinese Sentence Realization
Wei He | Haifeng Wang | Yuqing Guo | Ting Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2005

Automated Generalization of Phrasal Paraphrases from the Web
Weigang Li | Ting Liu | Yu Zhang | Sheng Li | Wei He
Proceedings of the Third International Workshop on Paraphrasing (IWP2005)