CryptOpiQA: A new Opinion and Question Answering dataset on Cryptocurrency

Sougata Sarkar; Aditya Badwal; Amartya Roy; Koustav Rudra; Kripabandhu Ghosh

Abstract

Cryptocurrency has attracted considerable public attention and opinion worldwide. Users have different kinds of information needs regarding such topics, and publicly available information is a good resource for satisfying those needs. In this paper, we investigate public opinion on cryptocurrency and bitcoin on two social media platforms, Twitter and Reddit. We have created a multi-level dataset, CryptOpiQA, and garnered valuable insights. The dataset contains both gold standard (manually annotated) and silver standard (inferred from the gold standard) labels. As part of this dataset, we have also created a Question Answering sub-corpus. We have used state-of-the-art LLMs and advanced techniques such as retrieval augmented generation (RAG) to improve question-answering (QnA) results. We believe this dataset and the analysis will be useful to the research community for studying user opinions and question answering on cryptocurrency.

Anthology ID: 2025.coling-main.736
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 11107–11120
URL: https://aclanthology.org/2025.coling-main.736/
Cite (ACL): Sougata Sarkar, Aditya Badwal, Amartya Roy, Koustav Rudra, and Kripabandhu Ghosh. 2025. CryptOpiQA: A new Opinion and Question Answering dataset on Cryptocurrency. In Proceedings of the 31st International Conference on Computational Linguistics, pages 11107–11120, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): CryptOpiQA: A new Opinion and Question Answering dataset on Cryptocurrency (Sarkar et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.736.pdf
