Robust Native Language Identification through Agentic Decomposition

Ahmet Yavuz Uluslu; Tannon Kew; Tilia Ellendorff; Gerold Schneider; Rico Sennrich

doi:10.18653/v1/2025.emnlp-main.423

Abstract

Large language models (LLMs) often achieve high performance in native language identification (NLI) benchmarks by leveraging superficial contextual clues such as names, locations, and cultural stereotypes, rather than the underlying linguistic patterns indicative of native language (L1) influence. To improve robustness, previous work has instructed LLMs to disregard such clues. In this work, we demonstrate that such a strategy is unreliable and model predictions can be easily altered by misleading hints. To address this problem, we introduce an agentic NLI pipeline inspired by forensic linguistics, where specialized agents accumulate and categorize diverse linguistic evidence before an independent final overall assessment. In this final assessment, a goal-aware coordinating agent synthesizes all evidence to make the NLI prediction. On two benchmark datasets, our approach significantly enhances NLI robustness against misleading contextual clues and performance consistency compared to standard prompting methods.

Anthology ID: 2025.emnlp-main.423
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8398–8414
URL: https://aclanthology.org/2025.emnlp-main.423/
DOI: 10.18653/v1/2025.emnlp-main.423
Cite (ACL): Ahmet Yavuz Uluslu, Tannon Kew, Tilia Ellendorff, Gerold Schneider, and Rico Sennrich. 2025. Robust Native Language Identification through Agentic Decomposition. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 8398–8414, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Robust Native Language Identification through Agentic Decomposition (Uluslu et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.423.pdf
Checklist: 2025.emnlp-main.423.checklist.pdf
