2024
Rethinking the Evaluation of In-Context Learning for LLMs
Guoxin Yu | Lemao Liu | Mo Yu | Yue Yu | Xiang Ao
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
In-context learning (ICL) has demonstrated excellent performance across various downstream NLP tasks, especially when combined with powerful large language models (LLMs). Existing studies evaluate ICL methods primarily based on downstream task performance. This evaluation protocol overlooks the significant cost of the demonstration configuration process, i.e., tuning the demonstration used as the ICL prompt. In this work, we point out that this evaluation protocol leads to unfair comparisons and potentially biased evaluation, because we find, surprisingly, a correlation between configuration cost and task performance. We then call for a two-dimensional evaluation paradigm that considers both aspects, facilitating a fairer comparison. Finally, based on our empirical finding that a demonstration optimized on one language model generalizes across language models of different sizes, we introduce a simple yet efficient strategy that can be applied to any ICL method as a plugin, yielding a better trade-off between the two dimensions under the proposed evaluation paradigm.
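One way to picture a two-dimensional comparison is to keep only the methods that are not beaten on both configuration cost and task performance at once. The sketch below is an illustration under stated assumptions, not the paper's evaluation procedure: the Pareto-front criterion, the `IclResult` type, the method names, and all numbers are hypothetical.

```python
from typing import List, NamedTuple


class IclResult(NamedTuple):
    method: str
    cost: float         # configuration cost (hypothetical unit, e.g. LLM calls)
    performance: float  # downstream task score, higher is better


def pareto_front(results: List[IclResult]) -> List[IclResult]:
    """Keep results that no other result beats on both cost and performance."""
    front = []
    for r in results:
        dominated = any(
            o.cost <= r.cost
            and o.performance >= r.performance
            and (o.cost < r.cost or o.performance > r.performance)
            for o in results
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=lambda r: r.cost)


# Made-up numbers, purely for illustration.
results = [
    IclResult("random", 0.0, 0.70),
    IclResult("retrieval", 100.0, 0.78),
    IclResult("search", 5000.0, 0.80),
    IclResult("wasteful", 6000.0, 0.75),
]
front = pareto_front(results)
for r in front:
    print(f"{r.method}: cost={r.cost}, performance={r.performance}")
```

Ranking by performance alone would place "search" first and hide that it costs far more to configure than "retrieval"; reporting both dimensions makes that trade-off visible.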
EFSA: Towards Event-Level Financial Sentiment Analysis
Tianyu Chen | Yiming Zhang | Guoxin Yu | Dapeng Zhang | Li Zeng | Qing He | Xiang Ao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this paper, we extend financial sentiment analysis (FSA) to the event level, since events usually serve as the subject of the sentiment in financial text. Though extracting events from financial text may be conducive to accurate sentiment predictions, it poses specialized challenges because events in financial text are lengthy and discontinuous. To this end, we reconceptualize event extraction as a classification task by designing a categorization comprising coarse-grained and fine-grained event categories. Under this setting, we formulate the Event-Level Financial Sentiment Analysis (EFSA for short) task, which outputs quintuples consisting of (company, industry, coarse-grained event, fine-grained event, sentiment) from financial text. A large-scale Chinese dataset containing 12,160 news articles and 13,725 quintuples is released as a new testbed for our task. A four-hop Chain-of-Thought LLM-based approach is devised for this task. Systematic investigations are conducted on our dataset, and the empirical results provide benchmark scores for existing methods and show that our proposed method reaches the current state-of-the-art. Our dataset and framework implementation are available at https://github.com/cty1934/EFSA
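The quintuple output described above can be pictured as a small record type. This is a minimal sketch: the class name, field names, and placeholder values are assumptions for illustration and are not taken from the released dataset or its label set.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class EfsaQuintuple:
    company: str
    industry: str
    coarse_event: str  # coarse-grained event category
    fine_event: str    # fine-grained event category
    sentiment: str


# Placeholder values only; the real categories come from the dataset.
example = EfsaQuintuple(
    company="<company>",
    industry="<industry>",
    coarse_event="<coarse-grained event>",
    fine_event="<fine-grained event>",
    sentiment="<sentiment>",
)
print(astuple(example))
```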
2023
Making Better Use of Training Corpus: Retrieval-based Aspect Sentiment Triplet Extraction via Label Interpolation
Guoxin Yu | Lemao Liu | Haiyun Jiang | Shuming Shi | Xiang Ao
Findings of the Association for Computational Linguistics: ACL 2023
In this paper, we aim to adapt the idea of retrieval-based neural approaches to the Aspect Sentiment Triplet Extraction (ASTE) task. Unlike previous studies that retrieve semantically similar neighbors, the ASTE task poses specialized challenges for this adaptation, i.e., its goal includes predicting the sentiment polarity, which is usually aspect-dependent. Semantically similar neighbors with different polarities will be unhelpful or even counterproductive. To tackle this issue, we propose a retrieval-based neural ASTE approach, named RLI (Retrieval-based Aspect Sentiment Triplet Extraction via Label Interpolation), which exploits the label information of neighbors. Given an aspect-opinion term pair, we retrieve semantically similar triplets from the training corpus and interpolate their label information into the augmented representation of the target pair. The retriever is jointly trained with the whole ASTE framework, and neighbors with both similar semantics and sentiments can be recalled with the aid of this distant supervision. In addition, we design a simple yet effective pre-training method for the retriever that implicitly encodes the label similarities. Extensive experiments and analysis on two widely used benchmarks show that the proposed model establishes a new state-of-the-art on ASTE.
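The general idea of interpolating neighbor labels into a representation can be sketched as follows. This is a toy illustration, not the RLI architecture: the cosine retriever, the softmax weighting, the temperature, the label set, and all vectors are assumptions made for the example.

```python
import math

LABELS = ["positive", "neutral", "negative"]  # assumed label set


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def interpolate_labels(query, neighbors, top_k=2, temperature=0.1):
    """Append a similarity-weighted label distribution of the top-k neighbors."""
    scored = sorted(
        ((cosine(query, vec), label) for vec, label in neighbors),
        reverse=True,
    )[:top_k]
    weights = [math.exp(score / temperature) for score, _ in scored]
    total = sum(weights)
    dist = [0.0] * len(LABELS)
    for weight, (_, label) in zip(weights, scored):
        dist[LABELS.index(label)] += weight / total
    return list(query) + dist


# Toy vectors standing in for encoded aspect-opinion pairs.
query = [1.0, 0.0]
neighbors = [
    ([0.9, 0.1], "positive"),
    ([0.8, 0.3], "positive"),
    ([0.0, 1.0], "negative"),
]
augmented = interpolate_labels(query, neighbors)
print(augmented)
```

Here the two nearest neighbors are both labeled positive, so the appended distribution puts all of its mass on that label. In a learned system the retriever itself would be trained so that neighbors with matching sentiment rank highest.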
Retrieval-Augmented Few-shot Text Classification
Guoxin Yu | Lemao Liu | Haiyun Jiang | Shuming Shi | Xiang Ao
Findings of the Association for Computational Linguistics: EMNLP 2023
Retrieval-augmented methods are successful in the standard scenario where the retrieval space is sufficient, whereas in the few-shot scenario with limited retrieval space, this paper shows it is non-trivial to put them into practice. First, it is impossible to retrieve semantically similar examples by using an off-the-shelf metric, so it is crucial to learn a task-specific retrieval metric. Second, our preliminary experiments demonstrate that it is difficult to optimize a plausible metric by minimizing the standard cross-entropy loss. In-depth analyses quantitatively show that minimizing the cross-entropy loss suffers from weak supervision signals and a severe gradient vanishing issue during optimization. To address these issues, we introduce two novel training objectives, namely EM-L and R-L, which provide more task-specific guidance to the retrieval metric through the EM algorithm and a ranking-based loss, respectively. Extensive experiments on 10 datasets demonstrate the superior performance of the proposed retrieval-augmented methods.
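For readers unfamiliar with ranking-based objectives, a generic pairwise margin loss gives the flavor: retrieved examples that share the query's label should score higher than those that do not. This is a sketch of that standard loss, not the paper's R-L objective; the function name, margin value, and scores are assumptions.

```python
def pairwise_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Mean hinge penalty over all (positive, negative) score pairs."""
    terms = [
        max(0.0, margin - pos + neg)
        for pos in pos_scores
        for neg in neg_scores
    ]
    return sum(terms) / len(terms) if terms else 0.0


# Hypothetical retrieval scores for one query.
well_separated = pairwise_ranking_loss([2.0], [0.5])  # gap exceeds the margin
tied = pairwise_ranking_loss([0.5], [0.5])            # no separation at all
print(well_separated, tied)
```

The loss is zero once every positive outscores every negative by at least the margin, so the training signal concentrates on the pairs that are still ranked incorrectly or too closely.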
2021
Making Flexible Use of Subtasks: A Multiplex Interaction Network for Unified Aspect-based Sentiment Analysis
Guoxin Yu | Xiang Ao | Ling Luo | Min Yang | Xiaofei Sun | Jiwei Li | Qing He
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Self Question-answering: Aspect-based Sentiment Analysis by Role Flipped Machine Reading Comprehension
Guoxin Yu | Jiwei Li | Ling Luo | Yuxian Meng | Xiang Ao | Qing He
Findings of the Association for Computational Linguistics: EMNLP 2021
The pivot of unified Aspect-based Sentiment Analysis (ABSA) is to couple aspect terms with their corresponding opinion terms, which may in turn make sentiment prediction easier. In this paper, we investigate the unified ABSA task from the perspective of Machine Reading Comprehension (MRC) by observing that the aspect and the opinion terms can serve as the query and answer in MRC interchangeably. We propose a new paradigm named Role Flipped Machine Reading Comprehension (RF-MRC) to address the task. At its heart, the predicted results of either Aspect Term Extraction (ATE) or Opinion Term Extraction (OTE) are regarded as the queries, and the matched opinion or aspect terms, respectively, are considered the answers. The queries and answers can be flipped for multi-hop detection. Finally, the sentiment of every matched aspect-opinion pair is predicted by the sentiment classifier. RF-MRC can solve the ABSA task without any additional data annotation or transformation. Experiments on three widely used benchmarks and a challenging dataset demonstrate the superiority of the proposed framework.