Who Did You Blame When Your Project Failed? Designing a Corpus for Presupposition Generation in Cross-Examination Dialogues

Maria Francis, Julius Steuer, Dietrich Klakow, Volha Petukhova


Abstract
This paper introduces a corpus for the novel task of presupposition generation: a natural language generation problem in which a model produces a list of the presuppositions carried by a given input sentence, which in the present research is a cross-examination question. Two datasets, PECaN (Presupposition, Entailment, Contradiction and Neutral) and PGen (Presupposition Generation), are designed to fine-tune existing BERT (CITATION) and T5 (CITATION) models for classification and generation tasks. Various corpus construction methods are proposed, ranging from manual annotation and prompting the GPT 3.0 model to augmenting data from existing corpora. The fine-tuned models achieved high accuracy on the novel Presupposition as Natural Language Inference (PNLI) task, which extends traditional Natural Language Inference (NLI) by incorporating instances of presupposition into the classification. T5 outperforms BERT by a broad margin, achieving an overall accuracy of 84.35% compared to BERT's 71.85%, and specifically when classifying presuppositions (93% vs. 73%, respectively). Regarding presupposition generation, we observed that, despite the limited amount of data used for fine-tuning, the model displays an emerging proficiency in generating presuppositions, reaching ROUGE scores of 43.47 and adhering to systematic patterns that mirror valid strategies for presupposition generation, although it failed to generate complete lists.
Anthology ID:
2024.lrec-main.1528
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
17564–17574
URL:
https://aclanthology.org/2024.lrec-main.1528
Cite (ACL):
Maria Francis, Julius Steuer, Dietrich Klakow, and Volha Petukhova. 2024. Who Did You Blame When Your Project Failed? Designing a Corpus for Presupposition Generation in Cross-Examination Dialogues. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17564–17574, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Who Did You Blame When Your Project Failed? Designing a Corpus for Presupposition Generation in Cross-Examination Dialogues (Francis et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.1528.pdf