Antonia Karaisl


2023

A Question of Confidence: Using OCR Technology for Script analysis
Antonia Karaisl
Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages

This article proposes a method that employs the Tesseract OCR engine to aid palaeographic analysis and scribal identification. It repurposes the confidence score reported by the OCR engine and uses several visualization methods to surface differences between font families, script types and manuscript hands.
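
As a rough illustration of what repurposing the confidence score could look like in practice, the sketch below collects per-word confidences from Tesseract output and averages them per sample. It is not the article's implementation: it assumes the `pytesseract` wrapper and Pillow are installed, and the function names `ocr_data`, `word_confidences` and `mean_confidence` are hypothetical helpers introduced here for illustration.

```python
from statistics import mean


def ocr_data(image_path, lang="eng"):
    """Run Tesseract on an image and return its word-level table as a dict.

    Assumption: pytesseract and Pillow are installed and the Tesseract
    binary is on PATH. They are imported lazily so that the helpers below
    can be used on pre-computed data without them.
    """
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_data(
        Image.open(image_path),
        lang=lang,
        output_type=pytesseract.Output.DICT,
    )


def word_confidences(data):
    """Pair each recognized word with its confidence.

    Rows with empty text or a confidence of -1 describe layout elements
    (pages, blocks, lines) rather than words, so they are skipped.
    """
    pairs = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # some pytesseract versions return strings
        if text.strip() and conf >= 0:
            pairs.append((text, conf))
    return pairs


def mean_confidence(data):
    """Average word confidence for one sample, or None if no words were found."""
    pairs = word_confidences(data)
    return mean(conf for _, conf in pairs) if pairs else None
```

Under these assumptions, comparing `mean_confidence(ocr_data(path))` across page images from different hands or typefaces, all run against the same trained model, gives one simple number per sample that could then be plotted. Which visualizations the article actually uses is not stated in this abstract.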