Better punctuation prediction with hierarchical phrase-based translation

Stephan Peitz, Markus Freitag, Hermann Ney


Abstract
Punctuation prediction is an important task in spoken language translation and can be performed by using a monolingual phrase-based translation system to translate from unpunctuated text to punctuated text. However, a punctuation prediction system based on phrase-based translation is not able to capture long-range dependencies between words and punctuation marks. In this paper, we propose to employ hierarchical translation in place of phrase-based translation and show that this approach is more robust to unseen word sequences. Furthermore, we analyze different optimization criteria for tuning the scaling factors of a monolingual statistical machine translation system. In our experiments, we compare the new approach with other punctuation prediction methods and show improvements in terms of F1 score and BLEU on the IWSLT 2014 German→English and English→French translation tasks.
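To illustrate the monolingual translation framing described in the abstract, the following is a minimal sketch of how a training pair might be built: the target side is the punctuated text and the source side is the same text with punctuation removed. The function names, the whitespace tokenization, and the punctuation inventory are assumptions made for illustration and are not taken from the paper.

```python
# Assumption: the set of punctuation marks to predict; the paper's actual
# inventory is not given in the abstract.
PUNCTUATION = {".", ",", "?", "!", ";", ":"}


def strip_punctuation(tokens):
    """Return the tokens with punctuation-only tokens removed."""
    return [t for t in tokens if t not in PUNCTUATION]


def make_training_pair(punctuated_sentence):
    """Build a (source, target) pair for a monolingual translation system.

    The input is assumed to be tokenized already, with punctuation marks
    separated from words by whitespace.
    """
    target = punctuated_sentence.split()
    source = strip_punctuation(target)
    return " ".join(source), " ".join(target)


source, target = make_training_pair("well , what do you think ?")
print(source)
print(target)
```

A translation system trained on such pairs learns to map the unpunctuated source to the punctuated target. Whether that system is phrase-based or hierarchical is independent of this data preparation step.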
Anthology ID:
2014.iwslt-papers.17
Volume:
Proceedings of the 11th International Workshop on Spoken Language Translation: Papers
Month:
December 4-5
Year:
2014
Address:
Lake Tahoe, California
Editors:
Marcello Federico, Sebastian Stüker, François Yvon
Venue:
IWSLT
SIG:
SIGSLT
Pages:
271–278
URL:
https://aclanthology.org/2014.iwslt-papers.17
Cite (ACL):
Stephan Peitz, Markus Freitag, and Hermann Ney. 2014. Better punctuation prediction with hierarchical phrase-based translation. In Proceedings of the 11th International Workshop on Spoken Language Translation: Papers, pages 271–278, Lake Tahoe, California.
Cite (Informal):
Better punctuation prediction with hierarchical phrase-based translation (Peitz et al., IWSLT 2014)
PDF:
https://aclanthology.org/2014.iwslt-papers.17.pdf