Can ChatGPT Understand Causal Language in Science Claims?

Yuheun Kim, Lu Guo, Bei Yu, Yingya Li


Abstract
This study evaluated ChatGPT’s ability to understand causal language in science papers and news by testing its accuracy in a task of labeling the strength of a claim as causal, conditional causal, correlational, or no relationship. The results show that ChatGPT still lags behind existing fine-tuned BERT models by a large margin. ChatGPT also had difficulty understanding conditional causal claims mitigated by hedges. However, its weaknesses may be utilized to improve the clarity of human annotation guidelines. Chain-of-Thought explanations were faithful and helpful for improving prompt performance, but finding the optimal prompt is difficult, given inconsistent results and the lack of an effective method to establish cause-effect relationships between prompts and outcomes, suggesting caution when generalizing prompt-engineering results across tasks or models.
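The claim-strength labeling task described in the abstract can, in principle, be posed to ChatGPT through the OpenAI chat API. The snippet below is a minimal sketch under assumed settings (the gpt-3.5-turbo model, a hypothetical zero-shot prompt, and a made-up label_claim_strength helper); it is not the authors' actual prompt or evaluation pipeline.

# Minimal sketch of prompting ChatGPT to label claim strength.
# Hypothetical prompt and helper, not the authors' setup.
# Assumes the openai>=1.0 Python client and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

LABELS = ["causal", "conditional causal", "correlational", "no relationship"]

def label_claim_strength(claim: str) -> str:
    """Ask the chat model to classify a science claim into one of four strength labels."""
    prompt = (
        "Classify the strength of the relationship stated in the claim below as one of: "
        + ", ".join(LABELS)
        + ". Answer with the label only.\n"
        + f"Claim: {claim}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the paper evaluates ChatGPT
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for evaluation
    )
    return response.choices[0].message.content.strip().lower()

# Example usage:
# label_claim_strength("Regular exercise may reduce the risk of heart disease.")
# -> expected to return something like "conditional causal" (claim hedged by "may")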
Anthology ID:
2023.wassa-1.33
Volume:
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Jeremy Barnes, Orphée De Clercq, Roman Klinger
Venue:
WASSA
Publisher:
Association for Computational Linguistics
Pages:
379–389
URL:
https://aclanthology.org/2023.wassa-1.33
DOI:
10.18653/v1/2023.wassa-1.33
Cite (ACL):
Yuheun Kim, Lu Guo, Bei Yu, and Yingya Li. 2023. Can ChatGPT Understand Causal Language in Science Claims?. In Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis, pages 379–389, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Can ChatGPT Understand Causal Language in Science Claims? (Kim et al., WASSA 2023)
PDF:
https://aclanthology.org/2023.wassa-1.33.pdf
Video:
https://aclanthology.org/2023.wassa-1.33.mp4