Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser

Elena Chistova


Abstract
We introduce UniRST, the first unified RST-style discourse parser capable of handling 18 treebanks in 11 languages without modifying their relation inventories. To overcome inventory incompatibilities, we propose and evaluate two training strategies: Multi-Head, which assigns a separate relation classification layer to each inventory, and Masked-Union, which enables shared-parameter training through selective label masking. We first benchmark mono-treebank parsing with a simple yet effective augmentation technique for low-resource settings. We then train a unified model and show that (1) the parameter-efficient Masked-Union approach is also the strongest, and (2) UniRST outperforms 16 of 18 mono-treebank baselines, demonstrating the advantages of single-model, multilingual, end-to-end discourse parsing across diverse resources.
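The Masked-Union idea described in the abstract, a single classifier over the union of all relation labels whose logits are masked down to each treebank's own inventory, can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the label names, treebank identifiers, and inventories below are invented for demonstration.

```python
import math

# Toy union label space across all treebanks (illustrative only).
UNION_LABELS = ["Elaboration", "Contrast", "Cause", "Joint", "Attribution"]

# Each treebank keeps its original inventory, a subset of the union.
INVENTORIES = {
    "toy-en": {"Elaboration", "Contrast", "Joint"},
    "toy-ru": {"Elaboration", "Cause", "Attribution"},
}

def masked_softmax(logits, treebank):
    """Softmax over shared logits, with labels outside the treebank's
    inventory masked to -inf so they receive zero probability."""
    allowed = INVENTORIES[treebank]
    masked = [z if lbl in allowed else float("-inf")
              for z, lbl in zip(logits, UNION_LABELS)]
    m = max(masked)
    exps = [math.exp(z - m) if z != float("-inf") else 0.0 for z in masked]
    total = sum(exps)
    return [e / total for e in exps]

# One shared set of logits; the mask restricts predictions per treebank.
logits = [2.0, 1.0, 3.0, 0.5, -1.0]
probs = masked_softmax(logits, "toy-en")
```

Because the mask only zeroes out invalid labels at the output layer, all encoder and classifier parameters stay shared across treebanks, which is what makes the approach parameter-efficient relative to one head per inventory.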
Anthology ID:
2025.codi-1.17
Volume:
Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Michael Strube, Chloe Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loaiciga, Amir Zeldes, Chuyuan Li
Venues:
CODI | WS
Publisher:
Association for Computational Linguistics
Pages:
197–208
URL:
https://aclanthology.org/2025.codi-1.17/
Cite (ACL):
Elena Chistova. 2025. Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser. In Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025), pages 197–208, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser (Chistova, CODI 2025)
PDF:
https://aclanthology.org/2025.codi-1.17.pdf