Konstantina Liagkou
2024
Challenging Error Correction in Recognised Byzantine Greek
John Pavlopoulos | Vasiliki Kougia | Esteban Garces Arias | Paraskevi Platanou | Stepan Shabalin | Konstantina Liagkou | Emmanouil Papadatos | Holger Essler | Jean-Baptiste Camps | Franz Fischer
Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Automatic correction of errors in Handwritten Text Recognition (HTR) output poses persistent challenges that are yet to be fully resolved. In this study, we introduce a shared task aimed at addressing this challenge, which attracted 271 submissions but yielded only a handful of promising approaches. This paper presents the datasets, the most effective methods, and an experimental analysis of error correction for HTRed manuscripts and papyri in Byzantine Greek, the language that followed Classical and preceded Modern Greek. Using recognised and transcribed data from seven centuries, we compare the two best-performing methods, one based on a neural encoder-decoder architecture and the other based on engineered linguistic rules. We show that the recognition error rate can be reduced by both, up to 2.5 points at the level of characters and up to 15 at the level of words, while also elucidating their respective strengths and weaknesses.
2022
A Study of Distant Viewing of ukiyo-e prints
Konstantina Liagkou | John Pavlopoulos | Ewa Machotka
Proceedings of the Thirteenth Language Resources and Evaluation Conference
This paper contributes to studying relationships between Japanese topography and places featured in early modern landscape prints, so-called ukiyo-e or ‘pictures of the floating world’. The printed inscriptions on these images feature diverse place-names, both man-made and natural formations. However, due to the corpus’s richness and diversity, the precise nature of artistic mediation of the depicted places remains little understood. In this paper, we explore a new analytical approach based on the macroanalysis of images facilitated by Natural Language Processing technologies. This paper presents a small dataset of inscriptions on prints that have been annotated by an art historian for the place-name entities they include. By fine-tuning and applying a Japanese BERT-based Named Entity Recogniser, we provide a use-case of a macroanalysis of a visual dataset that is hosted by the digital database of the Art Research Center at the Ritsumeikan University, Kyoto. Our work studies the relationship between topography and its visual renderings in early modern Japanese ukiyo-e landscape prints, demonstrating how an art historian’s work can be improved with Natural Language Processing toward distant viewing of visual datasets. We release our dataset and code for public use: https://github.com/connalia/ukiyo-e_meisho_nlp