2024
EU DisinfoTest: a Benchmark for Evaluating Language Models’ Ability to Detect Disinformation Narratives
Witold Sosnowski | Arkadiusz Modzelewski | Kinga Skorupska | Jahna Otterbacher | Adam Wierzbicki
Findings of the Association for Computational Linguistics: EMNLP 2024
As narratives shape public opinion and influence societal actions, distinguishing between truthful and misleading narratives has become a significant challenge. To address this, we introduce the EU DisinfoTest, a novel benchmark designed to evaluate the efficacy of Language Models in identifying disinformation narratives. Developed through a Human-in-the-Loop methodology and grounded in research from EU DisinfoLab, the EU DisinfoTest comprises more than 1,300 narratives. Our benchmark includes persuasive elements under Logos, Pathos, and Ethos rhetorical dimensions. We assessed state-of-the-art LLMs, including the newly released GPT-4o, on their capability to perform zero-shot classification of disinformation narratives versus credible narratives. Our findings reveal that LLMs tend to regard narratives with authoritative appeals as trustworthy, while those with emotional appeals are frequently incorrectly classified as disinformative. These findings highlight the challenges LLMs face in nuanced content interpretation and suggest the need for tailored adjustments in LLM training to better handle diverse narrative structures.
2016
Social and linguistic behavior and its correlation to trait empathy
Marina Litvak | Jahna Otterbacher | Chee Siang Ang | David Atkins
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES)
A growing body of research exploits social media behaviors to gauge psychological characteristics, though trait empathy has received little attention. Because of its intimate link to the ability to relate to others, our research aims to predict participants’ levels of empathy, given their textual and friending behaviors on Facebook. Using Poisson regression, we compared the variance explained in Davis’ Interpersonal Reactivity Index (IRI) scores on four constructs (empathic concern, personal distress, fantasy, perspective taking), by two classes of variables: 1) post content and 2) linguistic style. Our study lays the groundwork for a greater understanding of empathy’s role in facilitating interactions on social media.
2008
Modeling Document Dynamics: an Evolutionary Approach
Jahna Otterbacher | Dragomir Radev
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
News articles about the same event published over time have properties that challenge NLP and IR applications. A cluster of such texts typically exhibits instances of paraphrase and contradiction, as sources update the facts surrounding the story, often due to an ongoing investigation. The current hypothesis is that the stories evolve over time, beginning with the first text published on a given topic. This is tested using a phylogenetic approach as well as one based on language modeling. The fit of the evolutionary models is evaluated with respect to how well they facilitate the recovery of chronological relationships between the documents. Over all data clusters, the language modeling approach consistently outperforms the phylogenetics model. However, on manually collected clusters in which the documents are published within short time spans of one another, both have a similar performance, and produce statistically significant results on the document chronology recovery evaluation.
2005
Using Random Walks for Question-focused Sentence Retrieval
Jahna Otterbacher | Güneş Erkan | Dragomir Radev
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
2004
Comparing Semantically Related Sentences: The Case of Paraphrase Versus Subsumption
Jahna Otterbacher | Dragomir Radev
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics
RevisionBank: A Resource for Revision-based Multi-document Summarization and Evaluation
Jahna Otterbacher | Dragomir Radev
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Multi-document summaries produced via sentence extraction often suffer from a number of cohesion problems, including dangling anaphora, sudden shifts in topic and incorrect or awkward chronological ordering. Therefore, the development of an automated revision process to correct such problems is a research area of current interest. We present the RevisionBank, a corpus of 240 extractive, multi-document summaries that have been manually revised to promote cohesion. The summaries were revised by six linguistics students using a constrained set of revision operations that we previously developed. In the current paper, we describe the process of developing a taxonomy of cohesion problems and corrective revision operators that address such problems, as well as an annotation schema for our corpus. Finally, we discuss how our taxonomy and corpus can be used for the study of revision-based multi-document summarization as well as for summary evaluation.
CST Bank: A Corpus for the Study of Cross-document Structural Relationships
Dragomir Radev | Jahna Otterbacher | Zhu Zhang
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Clusters of multiple news stories related to the same topic exhibit a number of interesting properties. For example, when documents have been published at various points in time or by different authors or news agencies, one finds many instances of paraphrasing, information overlap and even contradiction. The current paper presents the Cross-document Structure Theory (CST) Bank, a collection of multi-document clusters in which pairs of sentences from different documents have been annotated for cross-document structure theory relationships. We will describe how we built the corpus, including our method for reducing the number of sentence pairs to be annotated by our hired judges, using lexical similarity measures. Finally, we will describe how CST and the CST Bank can be applied to different research areas such as multi-document summarization.
MEAD - A Platform for Multidocument Multilingual Text Summarization
Dragomir Radev | Timothy Allison | Sasha Blair-Goldensohn | John Blitzer | Arda Çelebi | Stanko Dimitrov | Elliott Drabek | Ali Hakim | Wai Lam | Danyu Liu | Jahna Otterbacher | Hong Qi | Horacio Saggion | Simone Teufel | Michael Topper | Adam Winkel | Zhu Zhang
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
2002
Revisions that improve cohesion in multi-document summaries: a preliminary study
Jahna C. Otterbacher | Dragomir R. Radev | Airong Luo
Proceedings of the ACL-02 Workshop on Automatic Summarization