Kaiwen Ou
2024
HW-TSC at SemEval-2024 Task 5: Self-Eval? A Confident LLM System for Auto Prediction and Evaluation for the Legal Argument Reasoning Task
Xiaofeng Zhao | Xiaosong Qiao | Kaiwen Ou | Min Zhang | Su Chang | Mengyao Piao | Yuang Li | Yinglu Li | Ming Zhu | Yilun Liu
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
In this article, we present an effective system for SemEval-2024 Task 5. The task involves assessing the feasibility of a given solution in civil litigation cases based on the relevant legal provisions, issues, solutions, and analysis, and it demands a high level of proficiency in U.S. law and natural language reasoning. We designed a self-eval LLM system that performs reasoning and self-assessment simultaneously: we defined a confidence interval and a prompt instructing the LLM to output the answer to a question along with its confidence level. We conducted a series of experiments to demonstrate the effectiveness of the self-eval mechanism. To reduce the randomness of the results, the final prediction is obtained by voting over three results generated by GPT-4. Our submission was conducted under a zero-resource setting, and we achieved first place in the task with an F1-score of 0.8231 and an accuracy of 0.8673.
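The abstract describes prompting the model to return an answer together with a confidence level and then taking a majority vote over three GPT-4 generations. The sketch below illustrates that general idea; the prompt wording, the `query_llm` placeholder, and the response format are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumed, not the authors' code) of self-eval prompting with
# majority voting: ask for an answer plus a confidence level, sample three
# responses, and vote on the answer.
import re
from collections import Counter

PROMPT_TEMPLATE = (
    "You are a U.S. legal reasoning assistant.\n"
    "Question: {question}\n"
    "Proposed solution: {solution}\n"
    "Reply as: Answer: <Yes/No> | Confidence: <0-100>"
)


def query_llm(prompt: str) -> str:
    """Hypothetical placeholder for a GPT-4 API call; returns a canned reply here."""
    return "Answer: Yes | Confidence: 85"


def parse_response(text: str):
    """Extract the Yes/No answer and the self-reported confidence."""
    match = re.search(r"Answer:\s*(Yes|No)\s*\|\s*Confidence:\s*(\d+)", text)
    if not match:
        return None, None
    return match.group(1), int(match.group(2))


def predict(question: str, solution: str, n_votes: int = 3) -> str:
    """Query the model n_votes times and return the majority answer."""
    answers = []
    for _ in range(n_votes):
        reply = query_llm(PROMPT_TEMPLATE.format(question=question, solution=solution))
        answer, _confidence = parse_response(reply)
        if answer is not None:
            answers.append(answer)
    return Counter(answers).most_common(1)[0][0] if answers else "No"


if __name__ == "__main__":
    print(predict("Is the claim time-barred?", "File suit within the statutory period."))
```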