Yosef Ardhito Winatmoko
2023
The Risk and Opportunity of Data Augmentation and Translation for ESG News Impact Identification with Language Models
Yosef Ardhito Winatmoko
|
Ali Septiandri
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing
This paper presents our findings in the ML-ESG-2 task, which focused on classifying a news snippet of various languages as “Risk” or “Opportunity” in the ESG (Environmental, Social, and Governance) context. We experimented with data augmentation and translation facilitated by Large Language Models (LLM). We found that augmenting the English dataset did not help to improve the performance. By fine-tuning RoBERTa models with the original data, we achieved the top position for the English and second place for the French task. In contrast, we could achieve comparable results on the French dataset by solely using the English translation, securing the third position for the French task with only marginal F1 differences to the second-place model.
2020
Knowing Right from Wrong: Should We Use More Complex Models for Automatic Short-Answer Scoring in Bahasa Indonesia?
Ali Akbar Septiandri
|
Yosef Ardhito Winatmoko
|
Ilham Firdausi Putra
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing
We compare three solutions to UKARA 1.0 challenge on automated short-answer scoring: single classical, ensemble classical, and deep learning. The task is to classify given answers to two questions, whether they are right or wrong. While recent development shows increasing model complexity to push the benchmark performances, they tend to be resource-demanding with mundane improvement. For the UKARA task, we found that bag-of-words and classical machine learning approaches can compete with ensemble models and Bi-LSTM model with pre-trained word2vec embedding from 200 million words. In this case, the single classical machine learning achieved less than 2% difference in F1 compared to the deep learning approach with 1/18 time for model training.
Search