Amanda Cercas Curry - ACL Anthology

Amanda Cercas Curry

Also published as: Amanda Cercas Curry

2026

Do Large Language Models Adapt to Language Variation across Socioeconomic Status?
Elisa Bassignana | Mike Zhang | Dirk Hovy | Amanda Cercas Curry
Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects

Humans adjust their linguistic style to the audience they are addressing. However, the extent to which LLMs adapt to different social contexts is largely unknown. As these models increasingly mediate human-to-human communication, their failure to adapt to diverse styles can perpetuate stereotypes and marginalize communities whose linguistic norms are less closely mirrored by the models, thereby reinforcing social stratification. We study the extent to which LLMs integrate into social media communication across different socioeconomic status (SES) communities. We collect a novel dataset from Reddit and YouTube, stratified by SES. We prompt four LLMs with incomplete text from that corpus and compare the LLM-generated completions to the originals along 94 sociolinguistic metrics, including syntactic, rhetorical, and lexical features. LLMs modulate their style with respect to SES to only a minor extent, often resulting in approximation or caricature, and tend to emulate the style of upper SES more effectively. Our findings (1) show how LLMs risk amplifying linguistic hierarchies and (2) call into question their validity for agent-based social simulation, survey experiments, and any research relying on language style as a social signal.

2025

Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks
Justin Zhao | Flor Miriam Plaza-del-Arco | Benjamin Genchel | Amanda Cercas Curry
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

As Large Language Models (LLMs) continue to evolve, evaluating them remains a persistent challenge. Many recent evaluations use LLMs as judges to score outputs from other LLMs, often relying on a single large model like GPT-4o. However, using a single LLM judge is prone to intra-model bias, and many tasks – such as those related to emotional intelligence, creative writing, and persuasiveness – may be too subjective for a single model to judge fairly. We introduce the Language Model Council (LMC), where a group of LLMs collaborate to create tests, respond to them, and evaluate each other’s responses to produce a ranking in a democratic fashion. Unlike previous approaches that focus on reducing cost or bias by using a panel of smaller models, our work examines the benefits and nuances of a fully inclusive LLM evaluation system. In a detailed case study on emotional intelligence, we deploy a council of 20 recent LLMs to rank each other on open-ended responses to interpersonal conflicts. Our results show that the LMC produces rankings that are more separable and more robust, and through a user study, we show that they are more consistent with human evaluations than any individual LLM judge. Using all LLMs for judging can be costly, however, so we use Monte Carlo simulations and hand-curated sub-councils to study hypothetical council compositions and discuss the value of the incremental LLM judge.

Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language Models
Mahammed Kamruzzaman | Amanda Cercas Curry | Alba Cercas Curry | Flor Miriam Plaza-del-Arco
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models (LLMs) are increasingly used to predict human emotions, but previous studies show that these models reproduce gendered emotion stereotypes. Emotion stereotypes are also tightly tied to race and skin tone (consider for example the trope of the angry black woman), but previous work has thus far overlooked this dimension. In this paper, we address this gap by introducing the first large-scale multimodal study of racial, gender, and skin-tone bias in emotion attribution, revealing how modality (text, images) and their combination shape emotion stereotypes in Multimodal LLMs (MLLMs). We evaluate four open-source MLLMs using 2.1K emotion-related events paired with 400 neutral face images across three different prompt strategies. Our findings reveal varying biases in MLLMs representations of different racial groups: models reproduce racial stereotypes across modalities, with textual cues being particularly noticeable. Models also reproduce colourist trends, with darker skin tones showing more skew. Our research highlights the need for future rigorous evaluation and mitigation strategies that account for race, colorism, and gender in MLLMs.

Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)
Su Lin Blodgett | Amanda Cercas Curry | Sunipa Dev | Siyan Li | Michael Madaio | Jack Wang | Sherry Tongshuang Wu | Ziang Xiao | Diyi Yang
Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)

Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate Speech.
Mikel Ngueajio | Flor Miriam Plaza-del-Arco | Yi-Ling Chung | Danda Rawat | Amanda Cercas Curry
Proceedings of the The 9th Workshop on Online Abuse and Harms (WOAH)

Automated counter-narratives (CN) offer a promising strategy for mitigating online hate speech, yet concerns about their affective tone, accessibility, and ethical risks remain. We propose a framework for evaluating Large Language Model (LLM)-generated CNs across four dimensions: persona framing, verbosity and readability, affective tone, and ethical robustness. Using GPT-4o-Mini, Cohere’s CommandR-7B, and Meta’s LLaMA 3.1-70B, we assess three prompting strategies on the MT-Conan and HatEval datasets.Our findings reveal that LLM-generated CNs are often verbose and adapted for people with college-level literacy, limiting their accessibility. While emotionally guided prompts yield more empathetic and readable responses, there remain concerns surrounding safety and effectiveness.

Blue-haired, misandriche, rabiata: Tracing the Connotation of ‘Feminist(s)’ Across Time, Languages and Domains
Arianna Muti | Sara Gemelli | Emanuele Moscato | Emilie Francis | Amanda Cercas Curry | Flor Miriam Plaza-del-Arco | Debora Nozza
Proceedings of the The 9th Workshop on Online Abuse and Harms (WOAH)

Understanding how words shift in meaning is crucial for analyzing societal attitudes.In this study, we investigate the contextual variations of the terms feminist, feminists along three axes: time, language, and domain.To this aim, we collect and release FEMME, a dataset comprising the occurrences of such terms from 2014 to 2023 in English, Italian and Swedish in Twitter, Reddit and Incel domains.Our methodology leverages frame analysis, as well as fine-tuning and LLMs. We find that the connotation of the plural form feminists is consistently more negative than feminist, indicating more hostility towards feminists as a collective, which often triggers greater societal pushback, reflecting broader patterns of group-based hostility and stigma. Across languages, we observe similar stereotypes towards feminists that often include body shaming, as well as accusations of hypocrisy and irrational behavior. In terms of time, we identify events that trigger a peak in terms of negative or positive connotation.As expected, the Incel spheres show predominantly negative connotations, while the general domains show mixed connotations.

Your Mileage May Vary: How Empathy and Demographics Shape Human Preferences in LLM Responses
Yishan Wang | Amanda Cercas Curry | Flor Miriam Plaza-del-Arco
Findings of the Association for Computational Linguistics: EMNLP 2025

As large language models (LLMs) increasingly assist in subjective decision-making (e.g., moral reasoning, advice), it is critical to understand whose preferences they align with—and why. While prior work uses aggregate human judgments, demographic variation and its linguistic drivers remain underexplored. We present a comprehensive analysis of how demographic background and empathy level correlate with preferences for LLM-generated dilemma responses, alongside a systematic study of predictive linguistic features (e.g., agency, emotional tone). Our findings reveal significant demographic divides and identify markers (e.g., power verbs, tentative phrasing) that predict group-level differences. These results underscore the need for demographically informed LLM evaluation.

The AI Gap: How Socioeconomic Status Affects Language Technology Interactions
Elisa Bassignana | Amanda Cercas Curry | Dirk Hovy
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Socioeconomic status (SES) fundamentally influences how people interact with each other and, more recently, with digital technologies like large language models (LLMs). While previous research has highlighted the interaction between SES and language technology, it was limited by reliance on proxy metrics and synthetic data. We survey 1,000 individuals from ‘diverse socioeconomic backgrounds’ about their use of language technologies and generative AI, and collect 6,482 prompts from their previous interactions with LLMs. We find systematic differences across SES groups in language technology usage (i.e., frequency, performed tasks), interaction styles, and topics. Higher SES entail a higher level of abstraction, convey requests more concisely, and topics like ‘inclusivity’ and ‘travel’. Lower SES correlates with higher anthropomorphization of LLMs (using ”hello” and ”thank you”) and more concrete language. Our findings suggest that while generative language technologies are becoming more accessible to everyone, socioeconomic linguistic differences still stratify their use to create a digital divide. These differences underscore the importance of considering SES in developing language technologies to accommodate varying linguistic needs rooted in socioeconomic factors and limit the AI Gap across SES groups.

Consistency is Key: Disentangling Label Variation in Natural Language Processing with Intra-Annotator Agreement
Gavin Abercrombie | Tanvi Dinkar | Amanda Cercas Curry | Verena Rieser | Dirk Hovy
Proceedings of the The 4th Workshop on Perspectivist Approaches to NLP

We commonly use agreement measures to assess the utility of judgements made by human annotators in Natural Language Processing (NLP) tasks. While inter-annotator agreement is frequently used as an indication of label reliability by measuring consistency between annotators, we argue for the additional use of intra-annotator agreement to measure label stability (and annotator consistency) over time. However, in a systematic review, we find that the latter is rarely reported in this field. Calculating these measures can act as important quality control and could provide insights into why annotators disagree. We conduct exploratory annotation experiments to investigate the relationships between these measures and perceptions of subjectivity and ambiguity in text items, finding that annotators provide inconsistent responses around 25% of the time across four different NLP tasks.

2024

Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models
Flor Miriam Plaza-del-Arco | Amanda Cercas Curry | Susanna Paoli | Alba Cercas Curry | Dirk Hovy
Findings of the Association for Computational Linguistics: EMNLP 2024

Emotions play important epistemological and cognitive roles in our lives, revealing our values and guiding our actions. Previous work has shown that LLMs display biases in emotion attribution along gender lines. However, unlike gender, which says little about our values, religion, as a socio-cultural system, prescribes a set of beliefs and values for its followers. Religions, therefore, cultivate certain emotions. Moreover, these rules are explicitly laid out and interpreted by religious leaders. Using emotion attribution, we explore how different religions are represented in LLMs. We find that:Major religions in the US and European countries are represented with more nuance, displaying a more shaded model of their beliefs.Eastern religions like Hinduism and Buddhism are strongly stereotyped.Judaism and Islam are stigmatized – the models’ refusal skyrocket. We ascribe these to cultural bias in LLMs and the scarcity of NLP literature on religion. In the rare instances where religion is discussed, it is often in the context of toxic language, perpetuating the perception of these religions as inherently toxic. This finding underscores the urgent need to address and rectify these biases. Our research emphasizes the crucial role emotions play in shaping our lives and how our values influence them.

Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution
Flor Miriam Plaza-del-Arco | Amanda Cercas Curry | Alba Curry | Gavin Abercrombie | Dirk Hovy
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large language models (LLMs) reflect societal norms and biases, especially about gender. While societal biases and stereotypes have been extensively researched in various NLP applications, there is a surprising gap for emotion analysis. However, emotion and gender are closely linked in societal discourse. E.g., women are often thought of as more empathetic, while men’s anger is more socially accepted. To fill this gap, we present the first comprehensive study of gendered emotion attribution in five state-of-the-art LLMs (open- and closed-source). We investigate whether emotions are gendered, and whether these variations are based on societal stereotypes. We prompt the models to adopt a gendered persona and attribute emotions to an event like ‘When I had a serious argument with a dear person’. We then analyze the emotions generated by the models in relation to the gender-event pairs. We find that all models consistently exhibit gendered emotions, influenced by gender stereotypes. These findings are in line with established research in psychology and gender studies. Our study sheds light on the complex societal interplay between language, gender, and emotion. The reproduction of emotion stereotypes in LLMs allows us to use those models to study the topic in detail, but raises questions about the predictive use of those same LLMs for emotion applications.

Emotion Analysis in NLP: Trends, Gaps and Roadmap for Future Directions
Flor Miriam Plaza-del-Arco | Alba A. Cercas Curry | Amanda Cercas Curry | Dirk Hovy
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Emotions are a central aspect of communication. Consequently, emotion analysis (EA) is a rapidly growing field in natural language processing (NLP). However, there is no consensus on scope, direction, or methods. In this paper, we conduct a thorough review of 154 relevant NLP publications from the last decade. Based on this review, we address four different questions: (1) How are EA tasks defined in NLP? (2) What are the most prominent emotion frameworks and which emotions are modeled? (3) Is the subjectivity of emotions considered in terms of demographics and cultural factors? and (4) What are the primary NLP applications for EA? We take stock of trends in EA and tasks, emotion frameworks used, existing datasets, methods, and applications. We then discuss four lacunae: (1) the absence of demographic and cultural aspects does not account for the variation in how emotions are perceived, but instead assumes they are universally experienced in the same manner; (2) the poor fit of emotion categories from the two main emotion theories to the task; (3) the lack of standardized EA terminology hinders gap identification, comparison, and future goals; and (4) the absence of interdisciplinary research isolates EA from insights in other fields. Our work will enable more focused research into EA and a more holistic approach to modeling emotions in NLP.

Classist Tools: Social Class Correlates with Performance in NLP
Amanda Cercas Curry | Giuseppe Attanasio | Zeerak Talat | Dirk Hovy
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The field of sociolinguistics has studied factors affecting language use for the last century. Labov (1964) and Bernstein (1960) showed that socioeconomic class strongly influences our accents, syntax and lexicon. However, despite growing concerns surrounding fairness and bias in Natural Language Processing (NLP), there is a dearth of studies delving into the effects it may have on NLP systems. We show empirically that NLP systems’ performance is affected by speakers’ SES, potentially disadvantaging less-privileged socioeconomic groups. We annotate a corpus of 95K utterances from movies with social class, ethnicity and geographical language variety and measure the performance of NLP systems on three tasks: language modelling, automatic speech recognition, and grammar error correction. We find significant performance disparities that can be attributed to socioeconomic status as well as ethnicity and geographical differences. With NLP technologies becoming ever more ubiquitous and quotidian, they must accommodate all language varieties to avoid disadvantaging already marginalised groups. We argue for the inclusion of socioeconomic class in future language technologies.

Subjective Isms? On the Danger of Conflating Hate and Offence in Abusive Language Detection
Amanda Cercas Curry | Gavin Abercrombie | Zeerak Talat
Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024)

Natural language processing research has begun to embrace the notion of annotator subjectivity, motivated by variations in labelling. This approach understands each annotator’s view as valid, which can be highly suitable for tasks that embed subjectivity, e.g., sentiment analysis. However, this construction may be inappropriate for tasks such as hate speech detection, as it affords equal validity to all positions on e.g., sexism or racism. We argue that the conflation of hate and offence can invalidate findings on hate speech, and call for future work to be situated in theory, disentangling hate from its orthogonal concept, offence.

Impoverished Language Technology: The Lack of (Social) Class in NLP
Amanda Cercas Curry | Zeerak Talat | Dirk Hovy
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Since Labov’s foundational 1964 work on the social stratification of language, linguistics has dedicated concerted efforts towards understanding the relationships between socio-demographic factors and language production and perception. Despite the large body of evidence identifying significant relationships between socio-demographic factors and language production, relatively few of these factors have been investigated in the context of NLP technology. While age and gender are well covered, Labov’s initial target, socio-economic class, is largely absent. We survey the existing Natural Language Processing (NLP) literature and find that only 20 papers even mention socio-economic status. However, the majority of those papers do not engage with class beyond collecting information of annotator-demographics. Given this research lacuna, we provide a definition of class that can be operationalised by NLP researchers, and argue for including socio-economic class in future language technologies.

Proceedings of Safety4ConvAI: The Third Workshop on Safety for Conversational AI @ LREC-COLING 2024
Tanvi Dinkar | Giuseppe Attanasio | Amanda Cercas Curry | Ioannis Konstas | Dirk Hovy | Verena Rieser
Proceedings of Safety4ConvAI: The Third Workshop on Safety for Conversational AI @ LREC-COLING 2024

Proceedings of the Third Workshop on Bridging Human--Computer Interaction and Natural Language Processing
Su Lin Blodgett | Amanda Cercas Curry | Sunipa Dev | Michael Madaio | Ani Nenkova | Diyi Yang | Ziang Xiao
Proceedings of the Third Workshop on Bridging Human--Computer Interaction and Natural Language Processing

2023

Computer says “No”: The Case Against Empathetic Conversational AI
Alba Curry | Amanda Cercas Curry
Findings of the Association for Computational Linguistics: ACL 2023

Emotions are an integral part of human cognition and they guide not only our understanding of the world but also our actions within it. As such, whether we soothe or flame an emotion is not inconsequential. Recent work in conversational AI has focused on responding empathetically to users, validating and soothing their emotions without a real basis. This AI-aided emotional regulation can have negative consequences for users and society, tending towards a one-noted happiness defined as only the absence of “negative” emotions. We argue that we must carefully consider whether and how to respond to users’ emotions.

MilaNLP at SemEval-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection
Amanda Cercas Curry | Giuseppe Attanasio | Debora Nozza | Dirk Hovy
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

We present the system proposed by the MilaNLP team for the Explainable Detection of Online Sexism (EDOS) shared task. We propose an ensemble modeling approach to combine different classifiers trained with domain adaptation objectives and standard fine-tuning. Our results show that the ensemble is more robust than individual models and that regularized models generate more “conservative” predictions, mitigating the effects of lexical overfitting.However, our error analysis also finds that many of the misclassified instances are debatable, raising questions about the objective annotatability of hate speech data.

Mirages. On Anthropomorphism in Dialogue Systems
Gavin Abercrombie | Amanda Cercas Curry | Tanvi Dinkar | Verena Rieser | Zeerak Talat
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Automated dialogue or conversational systems are anthropomorphised by developers and personified by users. While a degree of anthropomorphism is inevitable, conscious and unconscious design choices can guide users to personify them to varying degrees. Encouraging users to relate to automated systems as if they were human can lead to transparency and trust issues, and high risk scenarios caused by over-reliance on their outputs. As a result, natural language processing researchers have investigated the factors that induce personification and develop resources to mitigate such effects. However, these efforts are fragmented, and many aspects of anthropomorphism have yet to be explored. In this paper, we discuss the linguistic factors that contribute to the anthropomorphism of dialogue systems and the harms that can arise thereof, including reinforcing gender stereotypes and conceptions of acceptable language. We recommend that future efforts towards developing dialogue systems take particular care in their design, development, release, and description; and attend to the many linguistic cues that can elicit personification by users.

2021

ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI
Amanda Cercas Curry | Gavin Abercrombie | Verena Rieser
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

We present the first English corpus study on abusive language towards three conversational AI systems gathered ‘in the wild’: an open-domain social bot, a rule-based chatbot, and a task-based system. To account for the complexity of the task, we take a more ‘nuanced’ approach where our ConvAI dataset reflects fine-grained notions of abuse, as well as views from multiple expert annotators. We find that the distribution of abuse is vastly different compared to other commonly used datasets, with more sexually tinted aggression towards the virtual persona of these systems. Finally, we report results from bench-marking existing models against this data. Unsurprisingly, we find that there is substantial room for improvement with F1 scores below 90%.

Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants
Gavin Abercrombie | Amanda Cercas Curry | Mugdha Pandya | Verena Rieser
Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing

Technology companies have produced varied responses to concerns about the effects of the design of their conversational AI systems. Some have claimed that their voice assistants are in fact not gendered or human-like—despite design features suggesting the contrary. We compare these claims to user perceptions by analysing the pronouns they use when referring to AI assistants. We also examine systems’ responses and the extent to which they generate output which is gendered and anthropomorphic. We find that, while some companies appear to be addressing the ethical concerns raised, in some cases, their claims do not seem to hold true. In particular, our results show that system outputs are ambiguous as to the humanness of the systems, and that users tend to personify and gender them as a result.

2020

Conversational Assistants and Gender Stereotypes: Public Perceptions and Desiderata for Voice Personas
Amanda Cercas Curry | Judy Robertson | Verena Rieser
Proceedings of the Second Workshop on Gender Bias in Natural Language Processing

Conversational voice assistants are rapidly developing from purely transactional systems to social companions with “personality”. UNESCO recently stated that the female and submissive personality of current digital assistants gives rise for concern as it reinforces gender stereotypes. In this work, we present results from a participatory design workshop, where we invite people to submit their preferences for a what their ideal persona might look like, both in drawings as well as in a multiple choice questionnaire. We find no clear consensus which suggests that one possible solution is to let people configure/personalise their assistants. We then outline a multi-disciplinary project of how we plan to address the complex question of gender and stereotyping in digital assistants.

2019

A Crowd-based Evaluation of Abuse Response Strategies in Conversational Agents
Amanda Cercas Curry | Verena Rieser
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

How should conversational agents respond to verbal abuse through the user? To answer this question, we conduct a large-scale crowd-sourced evaluation of abuse response strategies employed by current state-of-the-art systems. Our results show that some strategies, such as “polite refusal”, score highly across the board, while for other strategies demographic factors, such as age, as well as the severity of the preceding abuse influence the user’s perception of which response is appropriate. In addition, we find that most data-driven models lag behind rule-based or commercial systems in terms of their perceived appropriateness.

2018

#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment
Amanda Cercas Curry | Verena Rieser
Proceedings of the Second ACL Workshop on Ethics in Natural Language Processing

Conversational AI systems, such as Amazon’s Alexa, are rapidly developing from purely transactional systems to social chatbots, which can respond to a wide variety of user requests. In this article, we establish how current state-of-the-art conversational systems react to inappropriate requests, such as bullying and sexual harassment on the part of the user, by collecting and analysing the novel #MeTooAlexa corpus. Our results show that commercial systems mainly avoid answering, while rule-based chatbots show a variety of behaviours and often deflect. Data-driven systems, on the other hand, are often non-coherent, but also run the risk of being interpreted as flirtatious and sometimes react with counter-aggression. This includes our own system, trained on “clean” data, which suggests that inappropriate system behaviour is not caused by data bias.

2017

Why We Need New Evaluation Metrics for NLG
Jekaterina Novikova | Ondřej Dušek | Amanda Cercas Curry | Verena Rieser
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

The majority of NLG evaluation relies on automatic metrics, such as BLEU . In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. We also show that metric performance is data- and system-specific. Nevertheless, our results also suggest that automatic metrics perform reliably at system-level and can support system development by finding cases where a system performs poorly.

2015

Generating and Evaluating Landmark-Based Navigation Instructions in Virtual Environments
Amanda Cercas Curry | Dimitra Gkatzia | Verena Rieser
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)

A Game-Based Setup for Data Collection and Task-Based Evaluation of Uncertain Information Presentation
Dimitra Gkatzia | Amanda Cercas Curry | Verena Rieser | Oliver Lemon
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)

Venues

NLPerspectives1