Zhuang Qiu

2024

Evaluating Grammatical Well-Formedness in Large Language Models: A Comparative Study with Human Judgments
Zhuang Qiu | Xufeng Duan | Zhenguang Cai
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Research in artificial intelligence has witnessed the surge of large language models (LLMs) demonstrating improved performance in various natural language processing tasks. This has sparked significant discussions about the extent to which large language models emulate human linguistic cognition and usage. This study delves into the representation of grammatical well-formedness in LLMs, which is a critical aspect of linguistic knowledge. In three preregistered experiments, we collected grammaticality judgment data for over 2400 English sentences with varying structures from ChatGPT and Vicuna, comparing them with human judgment data. The results reveal substantial alignment in the assessment of grammatical correctness between LLMs and human judgments, albeit with LLMs often showing more conservative judgments for grammatical correctness or incorrectness.

Large Language Models For Second Language English Writing Assessments: An Exploratory Comparison
Zhuang Qiu | Peizhi Yan | Zhenguang Cai
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

2023

Does ChatGPT Resemble Humans in Processing Implicatures?
Zhuang Qiu | Xufeng Duan | Zhenguang Cai
Proceedings of the 4th Natural Logic Meets Machine Learning Workshop

Recent advances in large language models (LLMs) and LLM-driven chatbots, such as ChatGPT, have sparked interest in the extent to which these artificial systems possess human-like linguistic abilities. In this study, we assessed ChatGPT’s pragmatic capabilities by conducting three preregistered experiments focused on its ability to compute pragmatic implicatures. The first experiment tested whether ChatGPT inhibits the computation of generalized conversational implicatures (GCIs) when explicitly required to process the text’s truth-conditional meaning. The second and third experiments examined whether the communicative context affects ChatGPT’s ability to compute scalar implicatures (SIs). Our results showed that ChatGPT did not demonstrate human-like flexibility in switching between pragmatic and semantic processing. Additionally, ChatGPT’s judgments did not exhibit the well-established effect of communicative context on SI rates.

2014

Guo1 and Guo2 in Chinese Temporal System
Zhuang Qiu | Qi Su
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing
