Title: Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora
Authors: Hainan Xu; Philipp Koehn
Date: 2017-09
Type: text (conference publication)
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Editors: Martha Palmer; Rebecca Hwa; Sebastian Riedel
Publisher: Association for Computational Linguistics
Place: Copenhagen, Denmark
Pages: 2945-2950
Identifier: xu-koehn-2017-zipporah
DOI: 10.18653/v1/D17-1319
URL: https://aclanthology.org/D17-1319/