Zixin Tang
2024
Learning to Write Rationally: How Information Is Distributed in Non-native Speakers’ Essays
Zixin Tang | Janet G. van Hell
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
People tend to distribute information evenly in language production for better and clearer communication. In this study, we compared essays written by second language (L2) learners with various native language (L1) backgrounds to investigate how they distribute information in their non-native L2 production. Analyses of surprisal and constancy of entropy rate indicated that writers with higher L2 proficiency can reduce the expected uncertainty of language production while still conveying informative content. However, the uniformity of information distribution showed less variability among different groups of L2 speakers, suggesting that this feature may be universal in L2 essay writing and less affected by L2 writers’ variability in L1 background and L2 proficiency.
2021
Are BERTs Sensitive to Native Interference in L2 Production?
Zixin Tang | Prasenjit Mitra | David Reitter
Proceedings of the Second Workshop on Insights from Negative Results in NLP
Using the essay portions of the International Corpus Network of Asian Learners of English (ICNALE) and the TOEFL11 corpus, we fine-tuned BERT-based neural language models to predict English learners’ native languages. Results showed that neural models can learn to represent and detect such native language influences, but that multilingually trained models have no advantage in doing so.