Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Word order flexibility is one of the distinctive features of SOV languages. In this work, we investigate whether the order and relative distance of preverbal dependents in Hindi, an SOV language, is affected by factors motivated by efficiency considerations during comprehension/production. We investigate the influence of Head–Dependent Mutual Information (HDMI), similarity-based interference, accessibility and case-marking. Results show that preverbal dependents remain close to the verbal head when the HDMI between the verb and its dependent is high. This demonstrates the influence of locality constraints on dependency distance and word order in an SOV language. Additionally, dependency distance were found to be longer when the dependent was animate, when it was case-marked and when it was semantically similar to other preverbal dependents. Together the results highlight the crosslinguistic generalizability of these factors and provide evidence for a functionally motivated account of word order in SOV languages such as Hindi.
Different aspects of language processing have been shown to be sensitive to priming but the findings of studies examining priming effects in adolescents with Autism Spectrum Disorder (ASD) and Developmental Language Disorder (DLD) have been inconclusive. We present a study analysing visual and implicit semantic priming in adolescents with ASD and DLD. Based on a dataset of fictional and script-like narratives, we evaluate how often and how extensively, content of two different priming sources is used by the participants. The first priming source was visual, consisting of images shown to the participants to assist them with their storytelling. The second priming source originated from commonsense knowledge, using crowdsourced data containing prototypical script elements. Our results show that individuals with ASD are less sensitive to both types of priming, but show typical usage of primed cues when they use them at all. In contrast, children with DLD show mostly average priming sensitivity, but exhibit an over-proportional use of the priming cues.
We introduce a framework in which production-rule based computational cognitive modeling and Reinforcement Learning can systematically interact and inform each other. We focus on linguistic applications because the sophisticated rule-based cognitive models needed to capture linguistic behavioral data promise to provide a stringent test suite for RL algorithms, connecting RL algorithms to both accuracy and reaction-time experimental data. Thus, we open a path towards assembling an experimentally rigorous and cognitively realistic benchmark for RL algorithms. We extend our previous work on lexical decision tasks and tabular RL algorithms (Brasoveanu and Dotlačil, 2020b) with a discussion of neural-network based approaches, and a discussion of how parsing can be formalized as an RL problem.
Continuous vector word representations (or word embeddings) have shown success in capturing semantic relations between words, as evidenced with evaluation against behavioral data of adult performance on semantic tasks (Pereira et al. 2016). Adult semantic knowledge is the endpoint of a language acquisition process; thus, a relevant question is whether these models can also capture emerging word representations of young language learners. However, the data of semantic knowledge of children is scarce or non-existent for some age groups. In this paper, we propose to bridge this gap by using Age of Acquisition norms to evaluate word embeddings learnt from child-directed input. We present two methods that evaluate word embeddings in terms of (a) the semantic neighbourhood density of learnt words, and (b) the convergence to adult word associations. We apply our methods to bag-of-words models, and we find that (1) children acquire words with fewer semantic neighbours earlier, and (2) young learners only attend to very local context. These findings provide converging evidence for validity of our methods in understanding the prerequisite features for a distributional model of word learning.
The age of acquisition of a word is a psycholinguistic variable concerning the age at which a word is typically learned. It correlates with other psycholinguistic variables such as familiarity, concreteness, and imageability. Existing datasets for multiple languages also include linguistic variables such as the length and the frequency of lemmas in different corpora. There are substantial sets of normative values for English, but for other languages, such as Italian, the coverage is scarce. In this paper,a set of regression experiments investigates whether it is possible to guess the age of acquisition of Italian lemmas that have not been previously rated by humans. An intrinsic evaluation is proposed, correlating estimated Italian lemmas’ AoA with English lemmas’ AoA. An extrinsic evaluation - using AoA values as features for the classification of literary excerpts labeled by age appropriateness - shows how es-sential is lexical coverage for this task.
The free association task has been very influential both in cognitive science and in computational linguistics. However, little research has been done to study how free associations develop in childhood. The current work focuses on the developmental hypothesis according to which free word associations emerge by mirroring the co-occurrence distribution of children’s linguistic environment. I trained a distributional semantic model on a large corpus of child language and I tested if it could predict children’s responses. The results largely supported the hypothesis: Co-occurrence-based similarity was a strong predictor of children’s associative behavior even controlling for other possible predictors such as phonological similarity, word frequency, and word length. I discuss the findings in the light of theories of conceptual development.
Interactive alignment is a major mechanism of linguistic coordination. Here we study the way this mechanism emerges in development across the lexical, syntactic, and conceptual levels. We leverage NLP tools to analyze a large-scale corpus of child-adult conversations between 2 and 5 years old. We found that, across development, children align consistently to adults above chance and that adults align consistently more to children than vice versa (even controlling for language production abilities). Besides these consistencies, we found a diversity of developmental trajectories across linguistic levels. These corpus-based findings provide strong support for an early onset of multi-level linguistic alignment in children and invites new experimental work.
Grammatical gender is a consistent and informative cue to the plural class of German nouns. We find that neural encoder-decoder models learn to rely on this cue to predict plural class, but adult speakers are relatively insensitive to it. This suggests that the neural models are not an effective cognitive model of German plural formation.
Case is an abstract grammatical feature that indicates argument relationship in a sentence. In English, cases are expressed on pronouns, as nominative case (e.g. I, he), accusative case (e.g. me, him) and genitive case (e.g. my, his). Children correctly use cased pronouns at a very young age. How do they acquire abstract case in the first place, when different cases are not associated with different meanings? This paper proposes that the distributional patterns in parents’ input could be used to distinguish grammatical cases in English.
Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling
Yiding Hao | Simon Mendelsohn | Rachel Sterneck | Randi Martinez | Robert Frank
By positing a relationship between naturalistic reading times and information-theoretic surprisal, surprisal theory (Hale, 2001; Levy, 2008) provides a natural interface between language models and psycholinguistic models. This paper re-evaluates a claim due to Goodkind and Bicknell (2018) that a language model’s ability to model reading times is a linear function of its perplexity. By extending Goodkind and Bicknell’s analysis to modern neural architectures, we show that the proposed relation does not always hold for Long Short-Term Memory networks, Transformers, and pre-trained models. We introduce an alternate measure of language modeling performance called predictability norm correlation based on Cloze probabilities measured from human subjects. Our new metric yields a more robust relationship between language model quality and psycholinguistic modeling performance that allows for comparison between models with different training configurations.