Benno Uthayasooriyar
2026
DocPolarBERT: A Pre-trained Model for Document Understanding with Relative Polar Coordinate Encoding of Layout Structures
Benno Uthayasooriyar | Antoine Ly | Franck Vermet | Caio Corro
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
We propose a novel self-attention mechanism for document understanding that takes into account text block positions in a relative polar coordinate system rather than a Cartesian one. Based on this mechanism, we build DocPolarBERT, a layout-aware BERT model for document understanding that eliminates the need for absolute 2D positional embeddings. Despite being pre-trained on a dataset more than six times smaller than the widely used IIT-CDIP corpus, DocPolarBERT achieves state-of-the-art results. These results demonstrate that a carefully designed attention mechanism can compensate for reduced pre-training data, offering an efficient and effective alternative for document understanding.
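The core idea is to describe each pair of text blocks by their relative polar coordinates (distance and angle between block centers) instead of absolute Cartesian positions. A minimal sketch of that geometric computation, assuming axis-aligned bounding boxes `(x0, y0, x1, y1)` (this illustrates the coordinate system only, not DocPolarBERT's actual attention biasing or bucketing scheme):

```python
import math

def relative_polar(box_a, box_b):
    """Relative polar coordinates (distance, angle) from box_a to box_b.

    Boxes are (x0, y0, x1, y1); we compare their centers. Hypothetical
    sketch of the relative-position idea, not the model's exact encoding.
    """
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    dx, dy = bx - ax, by - ay
    r = math.hypot(dx, dy)       # radial distance between centers
    theta = math.atan2(dy, dx)   # direction angle in (-pi, pi]
    return r, theta
```

Because `(r, theta)` depends only on the offset between two blocks, the representation is invariant to where the pair sits on the page, which is what removes the need for absolute 2D positional embeddings.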
2025
Training LayoutLM from Scratch for Efficient Named-Entity Recognition in the Insurance Domain
Benno Uthayasooriyar | Antoine Ly | Franck Vermet | Caio Corro
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Generic pre-trained neural networks may struggle to produce good results in specialized domains like finance and insurance. This stems from a mismatch between pre-training data and downstream tasks, as in-domain data are often scarce because of privacy constraints. In this work, we compare different pre-training strategies for LayoutLM. We show that using domain-relevant documents improves results on a named-entity recognition (NER) problem using a novel dataset of anonymized insurance-related financial documents called PAYSLIPS. Moreover, we show that we can achieve competitive results using a smaller and faster model.
Few-shot domain adaptation for named-entity recognition via joint constrained k-means and subspace selection
Ayoub Hammal | Benno Uthayasooriyar | Caio Corro
Proceedings of the 31st International Conference on Computational Linguistics
Named-entity recognition (NER) is a task that typically requires large annotated datasets, which limits its applicability across domains with varying entity definitions. This paper addresses few-shot NER, aiming to transfer knowledge to new domains with minimal supervision. Unlike previous approaches that rely solely on limited annotated data, we propose a weakly-supervised algorithm that combines small labeled datasets with large amounts of unlabeled data. Our method extends the k-means algorithm with label supervision, cluster size constraints, and domain-specific discriminative subspace selection. This unified framework achieves state-of-the-art results in few-shot NER, demonstrating its effectiveness in leveraging unlabeled data and adapting to domain-specific challenges.
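The label-supervised clustering component can be illustrated with a toy seeded k-means: labeled points stay pinned to their gold cluster, while unlabeled points are reassigned to the nearest centroid each round. This is a simplified sketch of one ingredient only; the paper's method additionally enforces cluster-size constraints and selects a discriminative subspace, both omitted here:

```python
import numpy as np

def seeded_kmeans(X, seed_labels, n_clusters, n_iter=20):
    """Toy k-means with label supervision.

    X: (n, d) array of points. seed_labels: (n,) array where entry k >= 0
    pins a point to cluster k and -1 marks it unlabeled. Hypothetical
    illustration; assumes every cluster has at least one seeded point.
    """
    rng = np.random.default_rng(0)
    # Start from the seeds; unlabeled points get a random initial cluster.
    assign = np.where(seed_labels >= 0, seed_labels,
                      rng.integers(0, n_clusters, len(X)))
    for _ in range(n_iter):
        centroids = np.stack([X[assign == k].mean(axis=0)
                              for k in range(n_clusters)])
        # Squared Euclidean distance of every point to every centroid.
        dists = ((X[:, None, :] - centroids[None]) ** 2).sum(-1)
        nearest = dists.argmin(axis=1)
        # Labeled points never move; unlabeled points follow the data.
        assign = np.where(seed_labels >= 0, seed_labels, nearest)
    return assign
```

A handful of labeled seeds is enough to anchor the cluster identities so that the large unlabeled pool can be absorbed without the label-permutation ambiguity of plain k-means.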