Title: PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Authors: Daniel Ferrés, Horacio Saggion, Francesco Ronzano, Àlex Bravo
Date: 2018-05
Type: text
Genre: conference publication
Published in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Editors: Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga
Publisher: European Language Resources Association (ELRA)
Location: Miyazaki, Japan
Identifier: ferres-etal-2018-pdfdigest
URL: https://aclanthology.org/L18-1298/