Phatthachdau at SemEval-2026 Task 9: A Multi-Stage Augment-Judge-Train Pipeline for Multilingual Online Polarization Detection

Phan Phat

Abstract

We address the extreme label imbalance in the Hausa dataset, where only 11% of instances are polarized, through the Augment-Judge-Train (AJT) pipeline. By leveraging Gemini 2.0 for taxonomy-driven data generation and an LLM-as-a-Judge layer for quality control, we expanded the minority class sixfold. Our ensemble architecture, combining specialized encoders with LLM-LoRA, achieved 1st place in Hausa (0.8336 Macro-F1) and ranked in the top 10 for English. These results demonstrate the efficacy of culture-aware synthetic data in enhancing social NLP for low-resource languages.
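As a rough illustration of why Macro-F1 is an informative metric at the 11% positive rate mentioned in the abstract, the sketch below scores a trivial classifier that always predicts "not polarized". The data is a toy sample of 100 labels built only to match that ratio, not the task's dataset, and the helper names `f1_for_class` and `macro_f1` are illustrative, not taken from the paper.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the class is never predicted or present.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy sample (assumption): 11 polarized (1) and 89 non-polarized (0) instances.
y_true = [1] * 11 + [0] * 89
always_negative = [0] * 100  # baseline that ignores the minority class

accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
macro = macro_f1(y_true, always_negative)

print(f"accuracy = {accuracy:.2f}")   # accuracy = 0.89
print(f"macro-F1 = {macro:.3f}")      # macro-F1 = 0.471
```

The baseline looks strong on accuracy but scores below 0.5 on Macro-F1, because the minority class contributes an F1 of zero and each class is weighted equally.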

Anthology ID: 2026.semeval-1.208
Volume: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 1616–1620
URL: https://aclanthology.org/2026.semeval-1.208/
Cite (ACL): Phan Phat. 2026. Phatthachdau at SemEval-2026 Task 9: A Multi-Stage Augment-Judge-Train Pipeline for Multilingual Online Polarization Detection. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 1616–1620, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Phatthachdau at SemEval-2026 Task 9: A Multi-Stage Augment-Judge-Train Pipeline for Multilingual Online Polarization Detection (Phat, SemEval 2026)
PDF: https://aclanthology.org/2026.semeval-1.208.pdf