Krzysztof Hwaszcz


2023

ISO 24617-2 on a cusp of languages
Krzysztof Hwaszcz | Marcin Oleksy | Aleksandra Domogała | Jan Wieczorek
Proceedings of the 19th Joint ACL-ISO Workshop on Interoperable Semantics (ISA-19)

The article discusses the challenges of cross-linguistic dialogue act annotation, in which methods developed for one language are used to annotate conversations in another. It focuses on research on dialogue act annotation in Polish, based on the ISO standard developed for English. Using selected examples from the DiaBiz.Kom corpus, it examines differences between Polish and English that affect annotation: the use of honorifics in Polish, the use of inflection to convey meaning, the tendency toward complex sentence structures, and cultural differences that may play a role in annotating dialogue acts. The article also describes the creation of DiaBiz.Kom, a Polish dialogue corpus based on the ISO 24617-2 standard applied to 1100 transcripts.

2022

DiaBiz.Kom - towards a Polish Dialogue Act Corpus Based on ISO 24617-2 Standard
Marcin Oleksy | Jan Wieczorek | Dorota Drużyłowska | Julia Klyus | Aleksandra Domogała | Krzysztof Hwaszcz | Hanna Kędzierska | Daria Mikoś | Anita Wróż
Proceedings of the 29th International Conference on Computational Linguistics

This article presents the specification and evaluation of DiaBiz.Kom, a corpus of dialogue texts in Polish. The corpus contains transcriptions of telephone conversations conducted according to a prepared scenario. The transcripts have been manually annotated with a layer of information on communicative functions. DiaBiz.Kom is the first corpus of this type prepared for Polish and will be used to develop a dialogue analysis system and modules for creating advanced chatbots.