Title: CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
Authors: Ahmed El-Kishky, Vishrav Chaudhary, Francisco Guzmán, Philipp Koehn
Date: 2020-11
Published in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Publisher: Association for Computational Linguistics
Location: Online
Type: conference publication (text)
Pages: 5960-5969
Citation key: el-kishky-etal-2020-ccaligned
DOI: 10.18653/v1/2020.emnlp-main.480
URL: https://aclanthology.org/2020.emnlp-main.480/