Vidhya Nataraj


2024

pdf bib
IMNTPU at ML-ESG-3: Transformer Language Models for Multi-Lingual ESG Impact Type and Duration Classification
Yu Han Kao | Vidhya Nataraj | Ting-Chi Wang | Yu-Jyun Zheng | Hsiao-Chuan Liu | Wen-Hsuan Liao | Chia-Tung Tsai | Min-Yuh Day
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing @ LREC-COLING 2024

Our team participated in the multi-lingual Environmental, Social, and Governance (ESG) classification task, focusing on datasets in three languages: English, French, and Japanese. This study leverages Pre-trained Language Models (PLMs), with a particular emphasis on the Bidirectional Encoder Representations from Transformers (BERT) framework, to analyze sentence and document structures across these varied linguistic datasets. The team’s experimentation with diverse PLM-based network designs facilitated a nuanced comparative analysis within this multi-lingual context. For each language-specific dataset, different BERT-based transformer models were trained and evaluated. Notably, in the experimental results, the RoBERTa-Base model emerged as the most effective in official evaluation, particularly in the English dataset, achieving a micro-F1 score of 58.82 %, thereby demonstrating superior performance in classifying ESG impact levels. This research highlights the adaptability and effectiveness of PLMs in tackling the complexities of multi-lingual ESG classification tasks, underscoring the exceptional performance of the Roberta Base model in processing English-language data.