Barbora Hladká

Also published as: B. Hladká, Barbora Hladka


2022

Annotating Attribution in Czech News Server Articles
Barbora Hladka | Jiří Mírovský | Matyáš Kopp | Václav Moravec
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper focuses on detecting sources in Czech articles published on a news server of Czech public radio. In particular, we search for attribution in sentences and recognize attributed sources and their sentence context (signals). We organized a crowdsourcing annotation task that resulted in a data set of 2,167 stories with manually recognized signals and sources. In addition, the sources were classified as named or unnamed.

2020

Compiling Czech Parliamentary Stenographic Protocols into a Corpus
Barbora Hladka | Matyáš Kopp | Pavel Straňák
Proceedings of the Second ParlaCLARIN Workshop

The Parliament of the Czech Republic consists of two chambers: the Chamber of Deputies (Lower House) and the Senate (Upper House). In our work, we focus exclusively on the agenda and documents that relate to the Chamber of Deputies. We pay particular attention to stenographic protocols that record the Chamber of Deputies’ meetings. Our overall goal is to (1) compile the protocols into a ParlaCLARIN TEI encoded corpus, (2) make this corpus accessible and searchable in the TEITOK web-based platform, (3) annotate the corpus using the modules available in TEITOK, e.g. detect and recognize named entities, and (4) highlight the annotations in TEITOK. In addition, we add two more goals that we consider innovative: (5) update the corpus every time a new stenographic protocol is published online by the Chamber of Deputies and (6) expose the annotations as linked open data in order to improve the protocols’ interoperability with other existing linked open data. This paper is devoted to goals (1) and (5).

2018

Czech Legal Text Treebank 2.0
Vincent Kríž | Barbora Hladká
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Combining Textual and Speech Features in the NLI Task Using State-of-the-Art Machine Learning Techniques
Pavel Ircing | Jan Švec | Zbyněk Zajíc | Barbora Hladká | Martin Holub
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

We summarize the involvement of our CEMI team in the “NLI Shared Task 2017”, which deals with both textual and speech input data. We submitted the results achieved by using three different system architectures; each of them combines multiple supervised learning models trained on various feature sets. As expected, better results are achieved with the systems that use both the textual data and the spoken responses. Combining the input data of two different modalities led to a rather dramatic improvement in classification performance. Our best performing method is based on a set of feed-forward neural networks whose hidden-layer outputs are combined using a softmax layer. We achieved a macro-averaged F1 score of 0.9257 on the evaluation (unseen) test set and our team placed first in the main task together with three other teams.

Understanding Non-Native Writings: Can a Parser Help?
Jirka Hana | Barbora Hladká
Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)

We present a pilot study on parsing non-native texts written by learners of Czech. We performed experiments that have shown that at least high-level syntactic functions, like subject, predicate, and object, can be assigned based on a parser trained on standard native language.

2016

Czech Legal Text Treebank 1.0
Vincent Kríž | Barbora Hladká | Zdeňka Urešová
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We introduce a new member of the family of Prague dependency treebanks. The Czech Legal Text Treebank 1.0 is a morphologically and syntactically annotated corpus of 1,128 sentences. The treebank contains texts from the legal domain, namely documents from the Collection of Laws of the Czech Republic. Legal texts differ from other domains in several language phenomena influenced by a rather high frequency of very long sentences. Manual annotation of such sentences presents a new challenge. We describe a strategy and tools for this task. The resulting treebank can be explored in various ways. It can be downloaded from the LINDAT/CLARIN repository and viewed locally using the TrEd editor, or it can be accessed online using the KonText and TreeQuery tools.

Improving Dependency Parsing Using Sentence Clause Charts
Vincent Kríž | Barbora Hladká
Proceedings of the ACL 2016 Student Research Workshop

2015

RExtractor: a Robust Information Extractor
Vincent Kríž | Barbora Hladká
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

2014

Sentence diagrams: their evaluation and combination
Jirka Hana | Barbora Hladká | Ivana Lukšová
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop

2013

Feature Engineering in the NLI Shared Task 2013: Charles University Submission Report
Barbora Hladká | Martin Holub | Vincent Kríž
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

2012

Getting more data – Schoolkids as annotators
Jirka Hana | Barbora Hladká
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a new way to get more morphologically and syntactically annotated data. We have developed an annotation editor tailored to school children to involve them in text annotation. Using this editor, they practice morphology and dependency-based syntax in the same way as they normally do at (Czech) schools, without any special training. Their annotation is then automatically transformed into the target annotation schema. The editor is designed to be language independent; however, the subsequent transformation is driven by the annotation framework we are heading for. In our case, the object language is Czech and the target annotation scheme corresponds to the Prague Dependency Treebank annotation framework.

2009

Designing a Language Game for Collecting Coreference Annotation
Barbora Hladká | Jiří Mírovský | Pavel Schlesinger
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

Syntactic annotation of spoken utterances: A case study on the Czech Academic Corpus
Barbora Hladká | Zdeňka Urešová
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

Play the Language: Play Coreference
Barbora Hladká | Jiří Mírovský | Pavel Schlesinger
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

An Annotated Corpus Outside Its Original Context: A Corpus-Based Exercise Book
Barbora Hladká | Ondřej Kučera
Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications

2005

Prague Dependency Treebank as an Exercise Book of Czech
Barbora Hladká | Ondřej Kučera
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations

2001

Robust Knowledge Discovery from Parallel Speech and Text Sources
F. Jelinek | W. Byrne | S. Khudanpur | B. Hladká | H. Ney | F. J. Och | J. Cuřín | J. Psutka
Proceedings of the First International Conference on Human Language Technology Research

2000

The Context (not only) for Humans
Barbora Hladká
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

1998

Tagging Inflective Languages: Prediction of Morphological Categories for a Rich Structured Tagset
Jan Hajič | Barbora Hladká
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Tagging Inflective Languages: Prediction of Morphological Categories for a Rich, Structured Tagset
Jan Hajic | Barbora Hladka
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

1997

Probabilistic and Rule-Based Tagger of an Inflective Language- a Comparison
Jan Hajic | Barbora Hladka
Fifth Conference on Applied Natural Language Processing