MASSAlign: Alignment and Annotation of Comparable Documents

Gustavo Paetzold, Fernando Alva-Manchego, Lucia Specia


Abstract
We introduce MASSAlign: a Python library for the alignment and annotation of monolingual comparable documents. MASSAlign offers easy-to-use access to state-of-the-art algorithms for paragraph and sentence-level alignment, as well as novel algorithms for word-level annotation of transformation operations between aligned sentences. In addition, MASSAlign provides a visualization module to display and analyze the alignments and annotations performed.
Anthology ID:
I17-3001
Volume:
Proceedings of the IJCNLP 2017, System Demonstrations
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Editors:
Seong-Bae Park, Thepchai Supnithi
Venue:
IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
1–4
URL:
https://aclanthology.org/I17-3001
Cite (ACL):
Gustavo Paetzold, Fernando Alva-Manchego, and Lucia Specia. 2017. MASSAlign: Alignment and Annotation of Comparable Documents. In Proceedings of the IJCNLP 2017, System Demonstrations, pages 1–4, Taipei, Taiwan. Association for Computational Linguistics.
Cite (Informal):
MASSAlign: Alignment and Annotation of Comparable Documents (Paetzold et al., IJCNLP 2017)
PDF:
https://aclanthology.org/I17-3001.pdf
Data
Newsela