Title: Know thy Corpus! Robust Methods for Digital Curation of Web corpora
Author: Serge Sharoff
Date: 2020-05
Resource type: text
Language: eng
Published in: Proceedings of the Twelfth Language Resources and Evaluation Conference
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association
Place: Marseille, France
Genre: conference publication
ISBN: 979-10-95546-34-4
Citation key: sharoff-2020-know
URL: https://aclanthology.org/2020.lrec-1.298/
Pages: 2453-2460