Ella Schad


2024

pdf bib
The RIP Corpus of Collaborative Hypothesis-Making
Ella Schad | Jacky Visser | Chris Reed
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The dearth of literature combining hypothesis-making and collaborative problem solving presents a problem in the investigation into how hypotheses are generated in group environments. A new dataset, the Resolving Investigative hyPotheses (RIP) corpus, is introduced to address this issue. The corpus uses the fictionalised environment of a murder investigation game. An artificial environment restricts the number of possible hypotheses compared to real-world situations, allowing a deeper dive into the data. In three groups of three, participants collaborated to solve the mystery: two groups came to the wrong conclusion in different ways, and one succeeded in solving the game. RIP is a 49k-word dialogical corpus, consisting of three sub-corpora, annotated for argumentation and discourse structure on the basis of Inference Anchoring Theory. The corpus shows the emergent roles individuals took on and the strategies the groups employed, showing what can be gained through a deeper exploration of this domain. The corpus bridges the gap between these two areas – hypothesis generation and collaborative problem solving – by using an environment rich with potential for hypothesising within a highly collaborative space.

2022

pdf bib
Disagreement Space in Argument Analysis
Annette Hautli-Janisz | Ella Schad | Chris Reed
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022

For a highly subjective task such as recognising speaker intention and argumentation, the traditional way of generating gold standards is to aggregate a number of labels into a single one. However, this seriously neglects the underlying richness that characterises discourse and argumentation and is also, in some cases, straightforwardly impossible. In this paper, we present QT30nonaggr, the first corpus of non-aggregated argument annotation, which will be openly available upon publication. QT30nonaggr encompasses 10% of QT30, the largest corpus of dialogical argumentation and analysed broadcast political debate currently available with 30 episodes of BBC’s ‘Question Time’ from 2020 and 2021. Based on a systematic and detailed investigation of annotation judgements across all steps of the annotation process, we structure the disagreement space with a taxonomy of the types of label disagreements in argument annotation, identifying the categories of annotation errors, fuzziness and ambiguity.