Title: Kitten: a tool for normalizing HTML and extracting its textual content
Authors: Mathieu-Henri Falco; Véronique Moriceau; Anne Vilnat
Date: 2012-05
Resource type: text
Published in: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12)
Editors: Nicoletta Calzolari; Khalid Choukri; Thierry Declerck; Mehmet Uğur Doğan; Bente Maegaard; Joseph Mariani; Asuncion Moreno; Jan Odijk; Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
Place: Istanbul, Turkey
Genre: conference publication
Identifier: falco-etal-2012-kitten
URL: https://aclanthology.org/L12-1250/
Pages: 2261-2267