2024
pdf
bib
abs
Document-level Causal Relation Extraction with Knowledge-guided Binary Question Answering
Zimu Wang
|
Lei Xia
|
Wei Wang
|
Xinya Du
Findings of the Association for Computational Linguistics: EMNLP 2024
As an essential task in information extraction (IE), Event-Event Causal Relation Extraction (ECRE) aims to identify and classify the causal relationships between event mentions in natural language texts. However, existing research on ECRE has highlighted two critical challenges, including the lack of document-level modeling and causal hallucinations. In this paper, we propose a Knowledge-guided binary Question Answering (KnowQA) method with event structures for ECRE, consisting of two stages: Event Structure Construction and Binary Question Answering. We conduct extensive experiments under both zero-shot and fine-tuning settings with large language models (LLMs) on the MECI and MAVEN-ERE datasets. Experimental results demonstrate the usefulness of event structures on document-level ECRE and the effectiveness of KnowQA by achieving state-of-the-art on the MECI dataset. We observe not only the effectiveness but also the high generalizability and low inconsistency of our method, particularly when with complete event structures after fine-tuning the models.
pdf
bib
Using Pre-trained Language Model for Accurate ESG Prediction
Lei Xia
|
Mingming Yang
|
Qi Liu
Proceedings of the Eighth Financial Technology and Natural Language Processing and the 1st Agent AI for Scenario Planning
2023
pdf
bib
abs
Evaluating Reading Comprehension Exercises Generated by LLMs: A Showcase of ChatGPT in Education Applications
Changrong Xiao
|
Sean Xin Xu
|
Kunpeng Zhang
|
Yufang Wang
|
Lei Xia
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
The recent advancement of pre-trained Large Language Models (LLMs), such as OpenAI’s ChatGPT, has led to transformative changes across fields. For example, developing intelligent systems in the educational sector that leverage the linguistic capabilities of LLMs demonstrates a visible potential. Though researchers have recently explored how ChatGPT could possibly assist in student learning, few studies have applied these techniques to real-world classroom settings involving teachers and students. In this study, we implement a reading comprehension exercise generation system that provides high-quality and personalized reading materials for middle school English learners in China. Extensive evaluations of the generated reading passages and corresponding exercise questions, conducted both automatically and manually, demonstrate that the system-generated materials are suitable for students and even surpass the quality of existing human-written ones. By incorporating first-hand feedback and suggestions from experienced educators, this study serves as a meaningful pioneering application of ChatGPT, shedding light on the future design and implementation of LLM-based systems in the educational context.
2010
pdf
bib
abs
A Random Graph Walk based Approach to Computing Semantic Relatedness Using Knowledge from Wikipedia
Ziqi Zhang
|
Anna Lisa Gentile
|
Lei Xia
|
José Iria
|
Sam Chapman
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Determining semantic relatedness between words or concepts is a fundamental process to many Natural Language Processing applications. Approaches for this task typically make use of knowledge resources such as WordNet and Wikipedia. However, these approaches only make use of limited number of features extracted from these resources, without investigating the usefulness of combining various different features and their importance in the task of semantic relatedness. In this paper, we propose a random walk model based approach to measuring semantic relatedness between words or concepts, which seamlessly integrates various features extracted from Wikipedia to compute semantic relatedness. We empirically study the usefulness of these features in the task, and prove that by combining multiple features that are weighed according to their importance, our system obtains competitive results, and outperforms other systems on some datasets.
2009
pdf
bib
Too Many Mammals: Improving the Diversity of Automatically Recognized Terms
Ziqi Zhang
|
Lei Xia
|
Mark A. Greenwood
|
José Iria
Proceedings of the International Conference RANLP-2009
2008
pdf
bib
abs
An Approach to Modeling Heterogeneous Resources for Information Extraction
Lei Xia
|
José Iria
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper, we describe an approach that aims to model heterogeneous resources for information extraction. Document is modeled in graph representation that enables better understanding of multi-media document and its structure which ultimately could result better cross-media information extraction. We also describe our proposed algorithm that segment document-based on the document modeling approach we described in this paper.
2007
pdf
bib
WIT: Web People Search Disambiguation using Random Walks
José Iria
|
Lei Xia
|
Ziqi Zhang
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)