Joshua Tint

2026

Human communication is often implicit, conveying tone, identity, and intent beyond literal meanings. While large language models have achieved strong performance on explicit tasks such as summarization and reasoning, their capacity for expressivity, or implicit communication, remains underexplored. We introduce ExpressivityBench, a framework for evaluating the expressivity of LLMs using information-theoretic communication models. Our approach quantifies how well LLM-generated text communicates target properties without explicit mention, across nine tasks spanning emotion, identity, and tone. To enable scalable and reproducible evaluation, we employ LLM-based graders validated against human judgments. Our results reveal that while models are adept at expressing affective content, they struggle with sociolinguistic signals, lagging behind human baselines. This study provides a necessary step to evaluate human-like implicit communication, with implications for applications such as education, mental health support, and socially-aware dialogue systems. We provide code and data for our benchmark alongside our paper.

2025

pdf bib abs

Guardrails, not Guidance: Understanding Responses to LGBTQ+ Language in Large Language Models
Joshua Tint
Proceedings of the Queer in AI Workshop

Language models have integrated themselves into many aspects of digital life, shaping everything from social media to translation. This paper investigates how large language models (LLMs) respond to LGBTQ+ slang and heteronormative language. Through two experiments, the study assesses the emotional content and the impact of queer slang on responses from models including GPT-3.5, GPT-4o, Llama2, Llama3, Gemma and Mistral. The findings reveal that heteronormative prompts can trigger safety mechanisms, leading to neutral or corrective responses, while LGBTQ+ slang elicits more negative emotions. These insights punctuate the need to provide equitable outcomes for minority slangs and argots, in addition to eliminating explicit bigotry from language models.

pdf bib abs

NLP Needs Diversity outside of ‘Diversity’
Joshua Tint
Findings of the Association for Computational Linguistics: EMNLP 2025

This position paper argues that recent progress with diversity in NLP is disproportionately concentrated on a small number of areas surrounding fairness. We further argue that this is the result of a number of incentives, biases, and barriers which come together to disenfranchise marginalized researchers in non-fairness fields, or to move them into fairness-related fields. We substantiate our claims with an investigation into the demographics of NLP researchers by subfield, using our research to support a number of recommendations for ensuring that all areas within NLP can become more inclusive and equitable. In particular, we highlight the importance of breaking down feedback loops that reinforce disparities, and the need to address geographical and linguistic barriers that hinder participation in NLP research.

pdf bib abs

PROOD: A Simple LLM Out-of-Distribution Guardrail Leveraging Response Semantics
Joshua Tint
Findings of the Association for Computational Linguistics: EMNLP 2025

Out-of-distribution (OOD) detection is a key safeguard for large language models, especially when they’re deployed in real-world applications. However, existing OOD methods often struggle with prompts that are deliberately obfuscated, context-dependent, or superficially benign—making it hard to distinguish between harmless queries and adversarial or dangerous ones. These methods typically assess prompts in isolation, missing important semantic cues from the model’s response. We introduce PROOD, prompt-response OOD detection, a framework that jointly analyzes LLM prompts *and their corresponding outputs* to improve semantic understanding. PROOD supports zero-shot multiclass detection using synthetic data generation and it offers a tunable probabilistic classification output. We validate PROOD on three challenging benchmarks—TrustLLM, OR-Bench, and AdvBench—where consistently outperforms prior OOD techniques, improving F1 scores by up to 6.3 points, from 0.871 to 0.934. Our results show that incorporating model responses enables more accurate, context-aware OOD detection in complex and adversarial prompt environments.

Co-authors

Aditya Taparia 1

Venues

Fix author