Kata Ágnes Szűcs


2025

Recognizing the Structure and Content of Hungarian Civil Registers
Kata Ágnes Szűcs | Noémi Vadász | Zsolt Béla Záros
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

The study evaluates key steps in a system that converts digitized Hungarian state register records (1895-1980) into an SQL database. It examines how template selection and post-processing affect data accessibility and integration. The paper describes the compiled datasets, the annotation processes, and the evaluation functions used to measure processing quality, focusing on these two steps as means of improving the overall workflow and the accuracy of the published data. The evaluation method for publishing structured data can serve as a model for similar projects.