DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing

Zhengyuan Liu, Ke Shi, Nancy Chen


Abstract
Text discourse parsing plays an important role in understanding information flow and argumentative structure in natural language, making it beneficial for downstream tasks. While previous work has significantly improved the performance of RST discourse parsing, existing parsers are not readily applicable to practical use cases: (1) EDU segmentation is not integrated into most existing tree parsing frameworks, so it is not straightforward to apply such models to new data; (2) most parsers cannot be used in multilingual scenarios, because they are developed only for English; (3) parsers trained on single-domain treebanks do not generalize well to out-of-domain inputs. In this work, we propose a document-level multilingual RST discourse parsing framework that conducts EDU segmentation and discourse tree parsing jointly. Moreover, we propose a cross-translation augmentation strategy that enables the framework to support multilingual parsing and improves its domain generality. Experimental results show that our model achieves state-of-the-art performance on document-level multilingual RST parsing in all sub-tasks.
Anthology ID:
2021.codi-main.15
Volume:
Proceedings of the 2nd Workshop on Computational Approaches to Discourse
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic and Online
Editors:
Chloé Braud, Christian Hardmeier, Junyi Jessy Li, Annie Louis, Michael Strube, Amir Zeldes
Venue:
CODI
Publisher:
Association for Computational Linguistics
Pages:
154–164
URL:
https://aclanthology.org/2021.codi-main.15
DOI:
10.18653/v1/2021.codi-main.15
Cite (ACL):
Zhengyuan Liu, Ke Shi, and Nancy Chen. 2021. DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing. In Proceedings of the 2nd Workshop on Computational Approaches to Discourse, pages 154–164, Punta Cana, Dominican Republic and Online. Association for Computational Linguistics.
Cite (Informal):
DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing (Liu et al., CODI 2021)
PDF:
https://aclanthology.org/2021.codi-main.15.pdf
Video:
https://aclanthology.org/2021.codi-main.15.mp4
Code:
seq-to-mind/DMRST_Parser
Data:
RST-DT