Beata Wójtowicz


2022

Error Correction Environment for the Polish Parliamentary Corpus
Maciej Ogrodniczuk | Michał Rudolf | Beata Wójtowicz | Sonia Janicka
Proceedings of the Workshop ParlaCLARIN III within the 13th Language Resources and Evaluation Conference

The paper introduces an environment for detecting and correcting various kinds of errors in the Polish Parliamentary Corpus. After a language-model-based error detection experiment produced too many false positives, a simpler rule-based method was introduced and is currently used in the manual verification of corpus texts. The paper presents the types of errors detected in the corpus, the workflow of the correction process, and the tools newly implemented for this purpose. To facilitate comparison of a target corpus XML file with its usually graphical PDF source, a new mechanism for inserting PDF page markers into XML was developed; it is used to display the single source page corresponding to a given place in the resulting XML directly in the error correction environment.

2009

A Repository of Free Lexical Resources for African Languages: The Project and the Method
Piotr Bański | Beata Wójtowicz
Proceedings of the First Workshop on Language Technologies for African Languages

2007

Towards the Automatic Extraction of Definitions in Slavic
Adam Przepiórkowski | Łukasz Degórski | Miroslav Spousta | Kiril Simov | Petya Osenova | Lothar Lemnitzer | Vladislav Kuboň | Beata Wójtowicz
Proceedings of the Workshop on Balto-Slavonic Natural Language Processing