Andrei Malchanau

2024

Fusing ISO 24617-2 Dialogue Acts and Application-Specific Semantic Content Annotations
Andrei Malchanau | Volha Petukhova | Harry Bunt
Proceedings of the 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024

Accurately annotated data determines whether a modern high-performing AI/ML model will present a suitable solution to a complex task/application challenge, or time and resources are wasted. The more adequate the structure of the incoming data is specified, the more efficient the data is translated to be used by the application. This paper presents an approach to an application-specific dialogue semantics design which integrates the dialogue act annotation standard ISO 24617-2 and various domain-specific semantic annotations. The proposed multi-scheme design offers a plausible and a rather powerful strategy to integrate, validate, extend and reuse existing annotations, and automatically generate code for dialogue system modules. Advantages and possible trade-offs are discussed.

2018

pdf bib

pdf bib

Towards Continuous Dialogue Corpus Creation: writing to corpus and generating from it
Andrei Malchanau | Volha Petukhova | Harry Bunt
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

pdf bib abs

Modelling Multi-issue Bargaining Dialogues: Data Collection, Annotation Design and Corpus
Volha Petukhova | Christopher Stevens | Harmen de Weerd | Niels Taatgen | Fokie Cnossen | Andrei Malchanau
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The paper describes experimental dialogue data collection activities, as well semantically annotated corpus creation undertaken within EU-funded METALOGUE project(www.metalogue.eu). The project aims to develop a dialogue system with flexible dialogue management to enable system’s adaptive, reactive, interactive and proactive dialogue behavior in setting goals, choosing appropriate strategies and monitoring numerous parallel interpretation and management processes. To achieve these goals negotiation (or more precisely multi-issue bargaining) scenario has been considered as the specific setting and application domain. The dialogue corpus forms the basis for the design of task and interaction models of participants negotiation behavior, and subsequently for dialogue system development which would be capable to replace one of the negotiators. The METALOGUE corpus will be released to the community for research purposes.

pdf bib abs

The DialogBank
Harry Bunt | Volha Petukhova | Andrei Malchanau | Kars Wijnhoven | Alex Fang
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents the DialogBank, a new language resource consisting of dialogues with gold standard annotations according to the ISO 24617-2 standard. Some of these dialogues have been taken from existing corpora and have been re-annotated according to the ISO standard; others have been annotated directly according to the standard. The ISO 24617-2 annotations have been designed according to the ISO principles for semantic annotation, as formulated in ISO 24617-6. The DialogBank makes use of three alternative representation formats, which are shown to be interoperable.

2014

pdf bib abs

Interoperability of Dialogue Corpora through ISO 24617-2-based Querying
Volha Petukhova | Andrei Malchanau | Harry Bunt
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper explores a way of achieving interoperability: developing a query format for accessing existing annotated corpora whose expressions make use of the annotation language defined by the standard. The interpretation of expressions in the query implements a mapping from ISO 24617-2 concepts to those of the annotation scheme used in the corpus. We discuss two possible ways to query existing annotated corpora using DiAML. One way is to transform corpora into DiAML compliant format, and subsequently query these data using XQuery or XPath. The second approach is to define a DiAML query that can be directly used to retrieve requested information from the annotated data. Both approaches are valid. The first one presents a standard way of querying XML data. The second approach is a DiAML-oriented querying of dialogue act annotated data, for which we designed an interface. The proposed approach is tested on two important types of existing dialogue corpora: spoken two-person dialogue corpora collected and annotated within the HCRC Map Task paradigm, and multiparty face-to-face dialogues of the AMI corpus. We present the results and evaluate them with respect to accuracy and completeness through statistical comparisons between retrieved and manually constructed reference annotations.