Esteban Garces Arias


2024

pdf bib
Challenging Error Correction in Recognised Byzantine Greek
John Pavlopoulos | Vasiliki Kougia | Esteban Garces Arias | Paraskevi Platanou | Stepan Shabalin | Konstantina Liagkou | Emmanouil Papadatos | Holger Essler | Jean-Baptiste Camps | Franz Fischer
Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)

Automatic correction of errors in Handwritten Text Recognition (HTR) output poses persistent challenges yet to be fully resolved. In this study, we introduce a shared task aimed at addressing this challenge, which attracted 271 submissions, yielding only a handful of promising approaches. This paper presents the datasets, the most effective methods, and an experimental analysis in error-correcting HTRed manuscripts and papyri in Byzantine Greek, the language that followed Classical and preceded Modern Greek. By using recognised and transcribed data from seven centuries, the two best-performing methods are compared, one based on a neural encoder-decoder architecture and the other based on engineered linguistic rules. We show that the recognition error rate can be reduced by both, up to 2.5 points at the level of characters and up to 15 at the level of words, while also elucidating their respective strengths and weaknesses.

pdf bib
Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation
Esteban Garces Arias | Julian Rodemann | Meimingwei Li | Christian Heumann | Matthias Aßenmacher
Findings of the Association for Computational Linguistics: EMNLP 2024

Despite the remarkable capabilities of large language models, generating high-quality text remains a challenging task. Numerous decoding strategies—such as beam search, sampling with temperature, top‐k sampling, nucleus (top‐p) sampling, typical decoding, contrastive decoding, and contrastive search—have been proposed to address these challenges by improving coherence, diversity, and resemblance to human-generated text. In this study, we introduce Adaptive Contrastive Search (ACS), a novel decoding strategy that extends contrastive search (CS) by incorporating an adaptive degeneration penalty informed by the model’s estimated uncertainty at each generation step. ACS aims to enhance creativity and diversity while maintaining coherence to produce high-quality outputs. Extensive experiments across various model architectures, languages, and datasets demonstrate that our approach improves both creativity and coherence, underscoring its effectiveness in text-generation tasks. We release our code, datasets, and models to facilitate further research.

2023

pdf bib
Automatic Transcription of Handwritten Old Occitan Language
Esteban Garces Arias | Vallari Pai | Matthias Schöffel | Christian Heumann | Matthias Aßenmacher
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

While existing neural network-based approaches have shown promising results in Handwritten Text Recognition (HTR) for high-resource languages and standardized/machine-written text, their application to low-resource languages often presents challenges, resulting in reduced effectiveness. In this paper, we propose an innovative HTR approach that leverages the Transformer architecture for recognizing handwritten Old Occitan language. Given the limited availability of data, which comprises only word pairs of graphical variants and lemmas, we develop and rely on elaborate data augmentation techniques for both text and image data. Our model combines a custom-trained Swin image encoder with a BERT text decoder, which we pre-train using a large-scale augmented synthetic data set and fine-tune on the small human-labeled data set. Experimental results reveal that our approach surpasses the performance of current state-of-the-art models for Old Occitan HTR, including open-source Transformer-based models such as a fine-tuned TrOCR and commercial applications like Google Cloud Vision. To nurture further research and development, we make our models, data sets, and code publicly available.

pdf bib
A tailored Handwritten-Text-Recognition System for Medieval Latin
Philipp Koch | Gilary Vera Nuñez | Esteban Garces Arias | Christian Heumann | Matthias Schöffel | Alexander Häberlin | Matthias Assenmacher
Proceedings of the Ancient Language Processing Workshop

The Bavarian Academy of Sciences and Humanities aims to digitize the Medieval Latin Dictionary. This dictionary entails record cards referring to lemmas in medieval Latin, a low-resource language. A crucial step of the digitization process is the handwritten text recognition (HTR) of the handwritten lemmas on the record cards. In our work, we introduce an end-to-end pipeline, tailored for the medieval Latin dictionary, for locating, extracting, and transcribing the lemmas. We employ two state-of-the-art image segmentation models to prepare the initial data set for the HTR task. Further, we experiment with different transformer-based models and conduct a set of experiments to explore the capabilities of different combinations of vision encoders with a GPT-2 decoder. Additionally, we also apply extensive data augmentation resulting in a highly competitive model. The best-performing setup achieved a character error rate of 0.015, which is even superior to the commercial Google Cloud Vision model, and shows more stable performance.