Cataldo Musto


2022

swapUNIBA@FinTOC2022: Fine-tuning Pre-trained Document Image Analysis Model for Title Detection on the Financial Domain
Pierluigi Cassotti | Cataldo Musto | Marco de Gemmis | Georgios Lekkas | Giovanni Semeraro
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022

In this paper, we present the results of the system we submitted to the FinTOC 2022 task. We address the task with a two-stage process: first, we detect titles using Document Image Analysis; then we train a supervised model to predict each title's hierarchical level. For Document Image Analysis, we use a Faster R-CNN pre-trained on the PubLayNet dataset and fine-tune it on the FinTOC 2022 training set. We extract orthographic and layout features from the detected titles and use them to train a Random Forest model that predicts the title level. The proposed system ranked #1 on both the Title Detection and the Table of Contents extraction subtasks for Spanish, and #3 on both subtasks for English and French.

2020

Exploiting Distributional Semantics Models for Natural Language Context-aware Justifications for Recommender Systems
Giuseppe Spillo | Cataldo Musto | Marco de Gemmis | Pasquale Lops | Giovanni Semeraro
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

2019

HateChecker: a Tool to Automatically Detect Hater Users in Online Social Networks
Cataldo Musto | Angelo Sansonetti | Marco Polignano | Giovanni Semeraro | Marco Stranisci
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

Computational Linguistics Against Hate: Hate Speech Detection and Visualization on Social Media in the “Contro L’Odio” Project
Arthur T. E. Capozzi | Mirko Lai | Valerio Basile | Cataldo Musto | Marco Polignano | Fabio Poletto | Manuela Sanguinetti | Cristina Bosco | Viviana Patti | Giancarlo Ruffo | Giovanni Semeraro | Marco Stranisci
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)