ESG-FTSE: A Corpus of News Articles with ESG Relevance Labels and Use Cases

Mariya Pavlova, Bernard Casey, Miaosen Wang


Abstract
We present ESG-FTSE, the first corpus of news articles with Environmental, Social and Governance (ESG) relevance annotations. In recent years, investors and regulators have pushed ESG investing into the mainstream due to the urgency of climate change. This has led to the rise of ESG scores, which evaluate an investment’s credentials as socially responsible. While demand for ESG scores is high, their quality varies widely. Quantitative techniques can be applied to improve ESG scores and, in turn, responsible investing. To contribute to resource building for ESG and financial text mining, we pioneer the ESG-FTSE corpus. We further present a first-of-its-kind ESG annotation schema with three levels: a binary classification (relevant versus irrelevant news articles), an ESG classification (ESG-related news articles), and the target company. We conducted both supervised and unsupervised learning experiments for ESG relevance detection to demonstrate that the corpus can be used in different settings to derive accurate ESG predictions.
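The three annotation levels named in the abstract could be represented roughly as follows. This is only an illustrative sketch, not the corpus's actual file format: the class names, field names, the E/S/G category values, and the rule that an irrelevant article carries no ESG category are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ESGCategory(Enum):
    # Assumed label set; the abstract only says "ESG classification".
    ENVIRONMENTAL = "E"
    SOCIAL = "S"
    GOVERNANCE = "G"


@dataclass(frozen=True)
class ArticleAnnotation:
    """Hypothetical record holding the three annotation levels for one article."""

    article_id: str                              # hypothetical identifier
    relevant: bool                               # level 1: relevant vs. irrelevant
    esg_category: Optional[ESGCategory] = None   # level 2: ESG classification
    target_company: Optional[str] = None         # level 3: target company

    def __post_init__(self) -> None:
        # Assumption: only relevant articles receive an ESG category.
        if not self.relevant and self.esg_category is not None:
            raise ValueError("irrelevant articles should not carry an ESG category")


# Usage: one relevant article about a made-up company.
example = ArticleAnnotation(
    article_id="article-001",
    relevant=True,
    esg_category=ESGCategory.ENVIRONMENTAL,
    target_company="ExampleCo",
)
```

Keeping the relevance flag separate from the ESG category mirrors the schema's layering: a relevance detector needs only the first level, while the finer-grained levels remain optional.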
Anthology ID:
2024.finnlp-1.14
Volume:
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Chung-Chi Chen, Xiaomo Liu, Udo Hahn, Armineh Nourbakhsh, Zhiqiang Ma, Charese Smiley, Veronique Hoste, Sanjiv Ranjan Das, Manling Li, Mohammad Ghassemi, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Venues:
FinNLP | WS
Publisher:
ELRA and ICCL
Pages:
137–149
URL:
https://aclanthology.org/2024.finnlp-1.14
Cite (ACL):
Mariya Pavlova, Bernard Casey, and Miaosen Wang. 2024. ESG-FTSE: A Corpus of News Articles with ESG Relevance Labels and Use Cases. In Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing @ LREC-COLING 2024, pages 137–149, Torino, Italia. ELRA and ICCL.
Cite (Informal):
ESG-FTSE: A Corpus of News Articles with ESG Relevance Labels and Use Cases (Pavlova et al., FinNLP-WS 2024)
PDF:
https://aclanthology.org/2024.finnlp-1.14.pdf
Optional supplementary material:
 2024.finnlp-1.14.OptionalSupplementaryMaterial.zip