The Bull and the Bear: Summarizing Stock Market Discussions

Ayush Kumar, Dhyey Jani, Jay Shah, Devanshu Thakar, Varun Jain, Mayank Singh


Abstract
Stock market investors actively debate stock ideas, investing strategies, news, and market movements on social media platforms. These discussions tend to be long and require extensive domain expertise to understand. In this paper, we curate such discussions and construct a first-of-its-kind abstractive summarization dataset. Our curated dataset consists of 7888 Reddit posts, with manually constructed summaries for 400 of them. We robustly evaluate the summaries and conduct experiments on state-of-the-art (SOTA) summarization tools to showcase their limitations. We plan to make the dataset publicly available. A sample of the dataset is available here: https://dhyeyjani.github.io/RSMC
Anthology ID:
2022.lrec-1.746
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6909–6913
URL:
https://aclanthology.org/2022.lrec-1.746
Cite (ACL):
Ayush Kumar, Dhyey Jani, Jay Shah, Devanshu Thakar, Varun Jain, and Mayank Singh. 2022. The Bull and the Bear: Summarizing Stock Market Discussions. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6909–6913, Marseille, France. European Language Resources Association.
Cite (Informal):
The Bull and the Bear: Summarizing Stock Market Discussions (Kumar et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.746.pdf