Naver Labs Europe’s Systems for the Document-Level Generation and Translation Task at WNGT 2019

Fahimeh Saleh, Alexandre Berard, Ioan Calapodescu, Laurent Besacier


Abstract
Recently, neural models have led to significant improvements in both machine translation (MT) and natural language generation (NLG) tasks. However, generating long descriptive summaries conditioned on structured data remains an open challenge. Likewise, MT that goes beyond sentence-level context is still an open issue (e.g., document-level MT or MT with metadata). To address these challenges, we propose to leverage data from both tasks and do transfer learning between MT, NLG, and MT with source-side metadata (MT+NLG). First, we train document-based MT systems with large amounts of parallel data. Then, we adapt these models to pure NLG and MT+NLG tasks by fine-tuning with smaller amounts of domain-specific data. This end-to-end NLG approach, without data selection and planning, outperforms the previous state of the art on the Rotowire NLG task. We participated in the “Document Generation and Translation” task at WNGT 2019 and ranked first in all tracks.
Anthology ID: D19-5631
Volume: Proceedings of the 3rd Workshop on Neural Generation and Translation
Month: November
Year: 2019
Address: Hong Kong
Venues: EMNLP | NGT | WS
Publisher: Association for Computational Linguistics
Pages: 273–279
URL: https://aclanthology.org/D19-5631
DOI: 10.18653/v1/D19-5631
Cite (ACL): Fahimeh Saleh, Alexandre Berard, Ioan Calapodescu, and Laurent Besacier. 2019. Naver Labs Europe’s Systems for the Document-Level Generation and Translation Task at WNGT 2019. In Proceedings of the 3rd Workshop on Neural Generation and Translation, pages 273–279, Hong Kong. Association for Computational Linguistics.
Cite (Informal): Naver Labs Europe’s Systems for the Document-Level Generation and Translation Task at WNGT 2019 (Saleh et al., EMNLP 2019)
PDF: https://aclanthology.org/D19-5631.pdf
Data: RotoWire