Xixian Liao


2023

pdf bib
The Impact of Familiarity on Naming Variation: A Study on Object Naming in Mandarin Chinese
Yunke He | Xixian Liao | Jialing Liang | Gemma Boleda
Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)

Different speakers often produce different names for the same object or entity (e.g., “woman” vs. “tourist” for a female tourist). The reasons behind variation in naming are not well understood. We create a Language and Vision dataset for Mandarin Chinese that provides an average of 20 names for 1319 naturalistic images, and investigate how familiarity with a given kind of object relates to the degree of naming variation it triggers across subjects. We propose that familiarity influences naming variation in two competing ways: increasing familiarity can either expand vocabulary, leading to higher variation, or promote convergence on conventional names, thereby reducing variation. We find evidence for both factors being at play. Our study illustrates how computational resources can be used to address research questions in Cognitive Science.

2021

pdf bib
Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution
Laura Aina | Xixian Liao | Gemma Boleda | Matthijs Westera
Proceedings of the 25th Conference on Computational Natural Language Learning

It is often posited that more predictable parts of a speaker’s meaning tend to be made less explicit, for instance using shorter, less informative words. Studying these dynamics in the domain of referring expressions has proven difficult, with existing studies, both psycholinguistic and corpus-based, providing contradictory results. We test the hypothesis that speakers produce less informative referring expressions (e.g., pronouns vs. full noun phrases) when the context is more informative about the referent, using novel computational estimates of referent predictability. We obtain these estimates training an existing coreference resolution system for English on a new task, masked coreference resolution, giving us a probability distribution over referents that is conditioned on the context but not the referring expression. The resulting system retains standard coreference resolution performance while yielding a better estimate of human-derived referent predictability than previous attempts. A statistical analysis of the relationship between model output and mention form supports the hypothesis that predictability affects the form of a mention, both its morphosyntactic type and its length.