Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models

Patrick Huber, Giuseppe Carenini


Abstract
In this paper, we extend the line of BERTology work by focusing on the important, yet less explored, alignment of pre-trained and fine-tuned PLMs with large-scale discourse structures. We propose a novel approach to infer discourse information for arbitrarily long documents. In our experiments, we find that the captured discourse information is local and general, even across a collection of fine-tuning tasks. We compare the inferred discourse trees with supervised, distantly supervised and simple baselines to explore the structural overlap, finding that constituency discourse trees align well with supervised models while containing complementary discourse information. Lastly, we individually explore self-attention matrices to analyze the information redundancy. We find that similar discourse information is consistently captured in the same heads.
Anthology ID:
2022.naacl-main.170
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2376–2394
URL:
https://aclanthology.org/2022.naacl-main.170
DOI:
10.18653/v1/2022.naacl-main.170
Cite (ACL):
Patrick Huber and Giuseppe Carenini. 2022. Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2376–2394, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models (Huber & Carenini, NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.170.pdf
Data
CNN/Daily Mail, MultiNLI, SQuAD, SST