A Bilingual Discourse Corpus and Its Applications

Yang Liu, Jiajun Zhang, Chengqing Zong, Yating Yang, Xi Zhou


Abstract
Existing discourse research focuses only on monolingual settings, and the inconsistency between languages limits the power of discourse theory in multilingual applications such as machine translation. To address this issue, we design and build a bilingual discourse corpus in which we are currently defining and annotating bilingual elementary discourse units (BEDUs). The BEDUs are then organized into hierarchical structures. Using this discourse style, we have annotated nearly 20K LDC sentences. Finally, we design a bilingual discourse-based method for machine translation evaluation and show the effectiveness of our bilingual discourse annotations.
Anthology ID:
L16-1159
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1002–1007
URL:
https://aclanthology.org/L16-1159
Cite (ACL):
Yang Liu, Jiajun Zhang, Chengqing Zong, Yating Yang, and Xi Zhou. 2016. A Bilingual Discourse Corpus and Its Applications. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1002–1007, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
A Bilingual Discourse Corpus and Its Applications (Liu et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1159.pdf