Julia Rayz

2025

pdf bib
Proceedings of the 1st Workshop on Computational Humor (CHum)
Christian F. Hempelmann | Julia Rayz | Tiansi Dong | Tristan Miller
Proceedings of the 1st Workshop on Computational Humor (CHum)

pdf bib abs
Hire Me or Not? Examining Language Model’s Behavior with Occupation Attributes
Damin Zhang | Yi Zhang | Geetanjali Bihani | Julia Rayz
Proceedings of the 31st International Conference on Computational Linguistics

With the impressive performance in various downstream tasks, large language models (LLMs) have been widely integrated into production pipelines, such as recruitment and recommendation systems. A known issue of models trained on natural language data is the presence of human biases, which can impact the fairness of the system. This paper investigates LLMs’ behavior with respect to gender stereotypes in the context of occupation decision making. Our framework is designed to investigate and quantify the presence of gender stereotypes in LLMs’ behavior via multi-round question answering. Inspired by prior work, we constructed a dataset using a standard occupation classification knowledge base released by authoritative agencies. We tested it on three families of LMs (RoBERTa, GPT, and Llama) and found that all models exhibit gender stereotypes analogous to human biases, but with different preferences. The distinct preferences of GPT-3.5-turbo and Llama2-70b-chat, along with additional analysis indicating GPT-4o-mini favors female subjects, may imply that the current alignment methods are insufficient for debiasing and could introduce new biases contradicting the traditional gender stereotypes. Our contribution includes a 73,500 prompts dataset constructed with a taxonomy of real-world occupations and a multi-step verification framework to evaluate model’s behavior regarding gender stereotype.

2024

pdf bib abs
Embodied Language Learning: Opportunities, Challenges, and Future Directions
Nadine Amin | Julia Rayz
Findings of the Association for Computational Linguistics: ACL 2024

While large language and vision-language models showcase impressive capabilities, they face a notable limitation: the inability to connect language with the physical world. To bridge this gap, research has focused on embodied language learning, where the language learner is situated in the world, perceives it, and interacts with it. This article explores the current standing of research in embodied language learning, highlighting opportunities and discussing common challenges. Lastly, it identifies existing gaps from the perspective of language understanding research within the embodied world and suggests potential future directions.

2023

pdf bib abs
COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models
Kanishka Misra | Julia Rayz | Allyson Ettinger
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

A characteristic feature of human semantic cognition is its ability to not only store and retrieve the properties of concepts observed through experience, but to also facilitate the inheritance of properties (can breathe) from superordinate concepts (animal) to their subordinates (dog)—i.e. demonstrate property inheritance. In this paper, we present COMPS, a collection of minimal pair sentences that jointly tests pre-trained language models (PLMs) on their ability to attribute properties to concepts and their ability to demonstrate property inheritance behavior. Analyses of 22 different PLMs on COMPS reveal that they can easily distinguish between concepts on the basis of a property when they are trivially different, but find it relatively difficult when concepts are related on the basis of nuanced knowledge representations. Furthermore, we find that PLMs can show behaviors suggesting successful property inheritance in simple contexts, but fail in the presence of distracting information, which decreases the performance of many models sometimes even below chance. This lack of robustness in demonstrating simple reasoning raises important questions about PLMs’ capacity to make correct inferences even when they appear to possess the prerequisite knowledge.

2021

pdf bib abs
Low Anisotropy Sense Retrofitting (LASeR) : Towards Isotropic and Sense Enriched Representations
Geetanjali Bihani | Julia Rayz
Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Contextual word representation models have shown massive improvements on a multitude of NLP tasks, yet their word sense disambiguation capabilities remain poorly explained. To address this gap, we assess whether contextual word representations extracted from deep pretrained language models create distinguishable representations for different senses of a given word. We analyze the representation geometry and find that most layers of deep pretrained language models create highly anisotropic representations, pointing towards the existence of representation degeneration problem in contextual word representations. After accounting for anisotropy, our study further reveals that there is variability in sense learning capabilities across different language models. Finally, we propose LASeR, a ‘Low Anisotropy Sense Retrofitting’ approach that renders off-the-shelf representations isotropic and semantically more meaningful, resolving the representation degeneration problem as a post-processing step, and conducting sense-enrichment of contextualized representations extracted from deep neural language models.

pdf bib
Implications of Using Internet Sting Corpora to Approximate Underage Victims
Tatiana Ringenberg | Kathryn Seigfried-Spellar | Julia Rayz
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib abs
Exploring BERT’s Sensitivity to Lexical Cues using Tests from Semantic Priming
Kanishka Misra | Allyson Ettinger | Julia Rayz
Findings of the Association for Computational Linguistics: EMNLP 2020

Models trained to estimate word probabilities in context have become ubiquitous in natural language processing. How do these models use lexical cues in context to inform their word probabilities? To answer this question, we present a case study analyzing the pre-trained BERT model with tests informed by semantic priming. Using English lexical stimuli that show priming in humans, we find that BERT too shows “priming”, predicting a word with greater probability when the context includes a related word versus an unrelated one. This effect decreases as the amount of information provided by the context increases. Follow-up analysis shows BERT to be increasingly distracted by related prime words as context becomes more informative, assigning lower probabilities to related words. Our findings highlight the importance of considering contextual constraint effects when studying word prediction in these models, and highlight possible parallels with human processing.

Co-authors

Christian F. Hempelmann 1

Tristan Miller 1

Tatiana Ringenberg 1

Kathryn Seigfried-Spellar 1

Damin Zhang 1

Yi Zhang 1

Venues

findings3
chum1
coling1
deelio1
eacl1
show all...

ws1

Fix author