A multilingual procedure for dictionary-based sentence alignment

Adam Meyers, Michiko Kosaka, Ralph Grishman


Abstract
This paper describes a sentence alignment technique based on a machine-readable dictionary. Alignment takes place in a single pass through the text, based on the scores of matches between pairs of source and target sentences. Pairings consisting of sets of matches are evaluated using a version of the Gale-Shapley solution to the stable marriage problem. An algorithm is described which can handle N-to-1 (or 1-to-N) matches, for N ≥ 0, i.e., deletions, 1-to-1 (including scrambling), and 1-to-many matches. A simple frequency-based method for acquiring supplemental dictionary entries is also discussed. We achieve high-quality alignments using available bilingual dictionaries, both for closely related language pairs (Spanish/English) and more distantly related pairs (Japanese/English).
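As a rough illustration of the idea in the abstract, the sketch below scores sentence pairs with a toy bilingual lexicon and then pairs them with the textbook Gale-Shapley procedure. It is not the authors' algorithm: the scoring formula, the names `dictionary_score` and `stable_match`, and the sample lexicon are all assumptions made for this example, and the sketch covers only 1-to-1 matches (including scrambled order) and deletions, not the N-to-1 cases the paper handles.

```python
from typing import Dict, List, Set


def dictionary_score(src: List[str], tgt: List[str],
                     lexicon: Dict[str, Set[str]]) -> float:
    """Hypothetical score: fraction of source tokens that have at least
    one dictionary translation present in the target sentence."""
    if not src:
        return 0.0
    tgt_set = set(tgt)
    hits = sum(1 for w in src if lexicon.get(w, set()) & tgt_set)
    return hits / len(src)


def stable_match(scores: List[List[float]]) -> Dict[int, int]:
    """Textbook Gale-Shapley with source sentences proposing.
    scores[i][j] is the score of source i against target j.
    Returns {source_index: target_index}; sources left out are unmatched."""
    n_src = len(scores)
    n_tgt = len(scores[0]) if scores else 0
    # Each source ranks all targets from best score to worst.
    prefs = [sorted(range(n_tgt), key=lambda j, i=i: -scores[i][j])
             for i in range(n_src)]
    next_choice = [0] * n_src          # next target each source will try
    engaged: Dict[int, int] = {}       # target index -> source index
    free = list(range(n_src))
    while free:
        i = free.pop()
        if next_choice[i] >= n_tgt:
            continue                   # no targets left: treated as a deletion
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = engaged.get(j)
        if current is None:
            engaged[j] = i
        elif scores[i][j] > scores[current][j]:
            engaged[j] = i             # target prefers the new proposer
            free.append(current)
        else:
            free.append(i)             # rejected; will try its next target
    return {i: j for j, i in engaged.items()}


# Toy Spanish/English data; the lexicon entries are illustrative only.
lexicon = {"gato": {"cat"}, "perro": {"dog"}, "negro": {"black"}}
source = [["el", "gato", "negro"], ["el", "perro"]]
target = [["the", "dog"], ["the", "black", "cat"]]
scores = [[dictionary_score(s, t, lexicon) for t in target] for s in source]
alignment = stable_match(scores)
```

Here the two target sentences appear in the opposite order from their sources, so `alignment` pairs source 0 with target 1 and source 1 with target 0. A practical system would also need a minimum-score threshold so that weak pairs are left unaligned.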
Anthology ID:
1998.amta-papers.17
Volume:
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
October 28-31
Year:
1998
Address:
Langhorne, PA, USA
Editors:
David Farwell, Laurie Gerber, Eduard Hovy
Venue:
AMTA
Publisher:
Springer
Pages:
187–198
URL:
https://link.springer.com/chapter/10.1007/3-540-49478-2_18
Cite (ACL):
Adam Meyers, Michiko Kosaka, and Ralph Grishman. 1998. A multilingual procedure for dictionary-based sentence alignment. In Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 187–198, Langhorne, PA, USA. Springer.
Cite (Informal):
A multilingual procedure for dictionary-based sentence alignment (Meyers et al., AMTA 1998)
PDF:
https://link.springer.com/chapter/10.1007/3-540-49478-2_18