Giuseppe Russo
2024
Self-Recognition in Language Models
Tim R. Davidson
|
Viacheslav Surkov
|
Veniamin Veselovsky
|
Giuseppe Russo
|
Robert West
|
Caglar Gulcehre
Findings of the Association for Computational Linguistics: EMNLP 2024
A rapidly growing number of applications rely on a small set of closed-source language models (LMs). This dependency might introduce novel security risks if LMs develop self-recognition capabilities. Inspired by human identity verification methods, we propose a novel approach for assessing self-recognition in LMs using model-generated “security questions”. Our test can be externally administered to keep track of frontier models as it does not require access to internal model parameters or output probabilities. We use our test to examine self-recognition in ten of the most capable open- and closed-source LMs currently publicly available. Our extensive experiments found no empirical evidence of general or consistent self-recognition in any examined LM. Instead, our results suggest that given a set of alternatives, LMs seek to pick the “best” answer, regardless of its origin. Moreover, we find indications that preferences about which models produce the best answers are consistent across LMs. We additionally uncover novel insights on position bias considerations for LMs in multiple-choice settings.
2023
Helping a Friend or Supporting a Cause? Disentangling Active and Passive Cosponsorship in the U.S. Congress
Giuseppe Russo
|
Christoph Gote
|
Laurence Brandenberger
|
Sophia Schlosser
|
Frank Schweitzer
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In the U.S. Congress, legislators can use active and passive cosponsorship to support bills. We show that these two types of cosponsorship are driven by two different motivations: the backing of political colleagues and the backing of the bill’s content. To this end, we develop an Encoder+RGCN based model that learns legislator representations from bill texts and speech transcripts. These representations predict active and passive cosponsorship with an F1-score of 0.88.Applying our representations to predict voting decisions, we show that they are interpretable and generalize to unseen tasks.
2020
Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation
Giuseppe Russo
|
Nora Hollenstein
|
Claudiu Cristian Musat
|
Ce Zhang
Findings of the Association for Computational Linguistics: EMNLP 2020
We introduce CGA, a conditional VAE architecture, to control, generate, and augment text. CGA is able to generate natural English sentences controlling multiple semantic and syntactic attributes by combining adversarial learning with a context-aware loss and a cyclical word dropout routine. We demonstrate the value of the individual model components in an ablation study. The scalability of our approach is ensured through a single discriminator, independently of the number of attributes. We show high quality, diversity and attribute control in the generated sentences through a series of automatic and human assessments. As the main application of our work, we test the potential of this new NLG model in a data augmentation scenario. In a downstream NLP task, the sentences generated by our CGA model show significant improvements over a strong baseline, and a classification performance often comparable to adding same amount of additional real data.