Lisa Bylinina

2024

pdf bib abs
Are BabyLMs Second Language Learners?
Lukas Edman | Lisa Bylinina | Faeze Ghorbanpour | Alexander Fraser
The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning

This paper describes a linguistically-motivated approach to the 2024 edition of the BabyLM Challenge. Rather than pursuing a first language learning (L1) paradigm, we approach the challenge from a second language (L2) learning perspective. In L2 learning, there is a stronger focus on learning explicit linguistic information, such as grammatical notions, definitions of words or different ways of expressing a meaning. This makes L2 learning potentially more efficient and concise. We approximate this using data from Wiktionary, grammar examples either generated by an LLM or sourced from grammar books, and paraphrase data.We find that explicit information about word meaning (in our case, Wiktionary) does not boost model performance, while grammatical information can give a small improvement. The most impactful data ingredient is sentence paraphrases, with our two best models being trained on 1) a mix of paraphrase data and data from the BabyLM pretraining dataset, and 2) exclusively paraphrase data.

pdf bib abs
Individuation in Neural Models with and without Visual Grounding
Alexey Tikhonov | Lisa Bylinina | Ivan P. Yamshchikov
Proceedings of the 1st Workshop on NLP for Science (NLP4Science)

We show differences between a language-and-vision model CLIP and two text-only models — FastText and SBERT — when it comes to the encoding of individuation information. We study latent representations that CLIP provides for substrates, granular aggregates, and various numbers of objects. We demonstrate that CLIP embeddings capture quantitative differences in individuation better than models trained on text-only data. Moreover, the individuation hierarchy we deduce from the CLIP embeddings agrees with the hierarchies proposed in linguistics and cognitive science.

2023

pdf bib
Too Much Information: Keeping Training Simple for BabyLMs
Lukas Edman | Lisa Bylinina
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning

pdf bib abs
Connecting degree and polarity: An artificial language learning study
Lisa Bylinina | Alexey Tikhonov | Ekaterina Garmash
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We investigate a new linguistic generalisation in pre-trained language models (taking BERT Devlin et al. 2019 as a case study). We focus on degree modifiers (expressions like slightly, very, rather, extremely) and test the hypothesis that the degree expressed by a modifier (low, medium or high degree) is related to the modifier’s sensitivity to sentence polarity (whether it shows preference for affirmative or negative sentences or neither). To probe this connection, we apply the Artificial Language Learning experimental paradigm from psycholinguistics to a neural language model. Our experimental results suggest that BERT generalizes in line with existing linguistic observations that relate de- gree semantics to polarity sensitivity, including the main one: low degree semantics is associated with preference towards positive polarity.

pdf bib abs
Leverage Points in Modality Shifts: Comparing Language-only and Multimodal Word Representations
Alexey Tikhonov | Lisa Bylinina | Denis Paperno
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

Multimodal embeddings aim to enrich the semantic information in neural representations of language compared to text-only models. While different embeddings exhibit different applicability and performance on downstream tasks, little is known about the systematic representation differences attributed to the visual modality. Our paper compares word embeddings from three vision-and-language models (CLIP, OpenCLIP and Multilingual CLIP, Radford et al. 2021; Ilharco et al. 2021; Carlsson et al. 2022) and three text-only models, with static (FastText, Bojanowski et al. 2017) as well as contextual representations (multilingual BERT Devlin et al. 2018; XLM-RoBERTa, Conneau et al. 2019). This is the first large-scale study of the effect of visual grounding on language representations, including 46 semantic parameters. We identify meaning properties and relations that characterize words whose embeddings are most affected by the inclusion of visual modality in the training data; that is, points where visual grounding turns out most important. We find that the effect of visual modality correlates most with denotational semantic properties related to concreteness, but is also detected for several specific semantic classes, as well as for valence, a sentiment-related connotational property of linguistic expressions.

2022

pdf bib abs
Transformers in the loop: Polarity in neural models of language
Lisa Bylinina | Alexey Tikhonov
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Representation of linguistic phenomena in computational language models is typically assessed against the predictions of existing linguistic theories of these phenomena. Using the notion of polarity as a case study, we show that this is not always the most adequate set-up. We probe polarity via so-called ‘negative polarity items’ (in particular, English ‘any’) in two pre-trained Transformer-based models (BERT and GPT-2). We show that – at least for polarity – metrics derived from language models are more consistent with data from psycholinguistic experiments than linguistic theory predictions. Establishing this allows us to more adequately evaluate the performance of language models and also to use language models to discover new insights into natural language grammar beyond existing linguistic theories. This work contributes to establishing closer ties between psycholinguistic experiments and experiments with language models.

Co-authors

Denis Paperno 1

Ivan P. Yamshchikov 1

Venues

nlp4science1

starsem1

Fix author