Meiru Zhang
2024
Can We Instruct LLMs to Compensate for Position Bias?
Meiru Zhang
|
Zaiqiao Meng
|
Nigel Collier
Findings of the Association for Computational Linguistics: EMNLP 2024
Position bias in large language models (LLMs) leads to difficulty in accessing information retrieved from the retriever, thus downgrading the effectiveness of Retrieval-Augmented Generation (RAG) approaches in open-question answering. Recent studies reveal that this bias is related to disproportional attention across the context. In this work, we examine how to direct LLMs to allocate more attention towards a selected segment of the context through prompting, aiming to compensate for the shortage of attention. We find that language models do not have relative position awareness of the context but can be directed by promoting instruction with an exact document index. Our analysis contributes to a deeper understanding of position bias in LLMs and provides a pathway to mitigate this bias by instruction, thus benefiting LLMs in locating and utilizing relevant information from retrieved documents in RAG applications. The code and data in our study have been made publicly available.
2023
COFFEE: A Contrastive Oracle-Free Framework for Event Extraction
Meiru Zhang
|
Yixuan Su
|
Zaiqiao Meng
|
Zihao Fu
|
Nigel Collier
Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023)
Event extraction is a complex task that involves extracting events from unstructured text. Prior classification-based methods require comprehensive entity annotations for joint training, while newer generation-based methods rely on heuristic templates containing oracle information such as event type, which is often unavailable in real-world scenarios. In this study, we consider a more realistic task setting, namely the Oracle-Free Event Extraction (OFEE) task, where only the input context is given, without any oracle information including event type, event ontology, or trigger word. To address this task, we propose a new framework, COFFEE. This framework extracts events solely based on the document context, without referring to any oracle information. In particular, COFFEE introduces a contrastive selection model to refine the generated triggers and handle multi-event instances. Our proposed COFFEE outperforms state-of-the-art approaches in the oracle-free setting of the event extraction task, as evaluated on two public variants of the ACE05 benchmark. The code used in our study has been made publicly available.
Search