Wenjing Xu

2025

Improving Proficiency and Grammar Accuracy for Chinese Language Learners with Large Language Models
Yuqi Liang | Wenjing Xu | Hongzhi Xu
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

In this study, we evaluate the performance of large language models (LLMs) in detecting and correcting grammatical errors made by Chinese language learners. We find that incorporating various linguistic features—such as dependency structures, parts of speech, and pinyin transliteration—into the prompts can potentially enhance model performance. Among these features, parts of speech and pinyin prove to be the most effective across all tested models. Additionally, our findings show that the success of error correction also depends on the severity of the errors. When the intended meaning is preserved, LLMs tend to provide accurate revisions following the principle of minimal editing. However, when the meaning is obscured, LLMs are more likely to produce divergent outputs, both in comparison to reference corrections and to the responses of other models.

pdf bib abs

Evaluating Large Language Models for In-Context Learning of Linguistic Patterns In Unseen Low Resource Languages
Hongpu Zhu | Yuqi Liang | Wenjing Xu | Hongzhi Xu
Proceedings of the First Workshop on Language Models for Low-Resource Languages

This paper investigates the ability of Large language Models (LLMs) in capturing linguistic patterns from unseen languages and applying them to translation between the languages and English within an in-context learning framework. Inspired by the International Linguistics Olympiad (IOL), we create test data consisting of translation puzzles between 40 low resource languages and English. We test the LLMs in two different strategies: direct prompting and step-by-step prompting. In the latter, the puzzles are manually decomposed into intermediate steps to allow LLMs learn and apply linguistic rules incrementally. The results show that this strategy can significantly improve the performance of LLMs, achieving comparable or slightly superior results to humans when translating the unseen languages to English. However, LLMs still struggle with translating English into the unseen languages, typically with complex syntactic rules. We further observe that LLMs cannot deal with languages with object-subject and noun-adjective word order compared to others, reflecting the potential impact imposed by typological features of languages in training data.

Co-authors

Venues

Fix author