Hana Skoumalova
Also published as: Hana Skoumalová
2016
SYN2015: Representative Corpus of Contemporary Written Czech
Michal Křen | Václav Cvrček | Tomáš Čapka | Anna Čermáková | Milena Hnátková | Lucie Chlumská | Tomáš Jelínek | Dominika Kováříková | Vladimír Petkevič | Pavel Procházka | Hana Skoumalová | Michal Škrabal | Petr Truneček | Pavel Vondřička | Adrian Jan Zasina
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
The paper concentrates on the design, composition and annotation of SYN2015, a new 100-million-word representative corpus of contemporary written Czech. SYN2015 is a sequel to the representative corpora of the SYN series, which can be described as traditional (as opposed to web-crawled corpora), featuring cleared copyright issues, well-defined composition, reliable annotation and high-quality text processing. At the same time, SYN2015 is designed to reflect the variety of written Czech text production, with the necessary methodological and technological enhancements, including detailed bibliographic annotation and text classification based on an updated scheme. The corpus has been produced using a completely rebuilt text processing toolchain called SynKorp. SYN2015 is lemmatized and morphologically and syntactically annotated with state-of-the-art tools. It has been published within the framework of the Czech National Corpus and is available via the standard corpus query interface KonText at http://kontext.korpus.cz, as well as in the form of a dataset in shuffled format.
2015
Analytic Morphology – Merging the Paradigmatic and Syntagmatic Perspective in a Treebank
Vladimír Petkevič | Alexandr Rosen | Hana Skoumalová | Přemysl Vítovec
The 5th Workshop on Balto-Slavic Natural Language Processing
2014
The SYN-series corpora of written Czech
Milena Hnátková | Michal Křen | Pavel Procházka | Hana Skoumalová
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The paper gives an overview of the SYN series of synchronic corpora of written Czech compiled within the framework of the Czech National Corpus project. It describes their design and processing with a focus on annotation, i.e. lemmatization and morphological tagging. The paper also introduces SYN2013PUB, a new 935-million newspaper corpus of Czech published in 2013 as the most recent addition to the SYN series before the planned revision of its architecture. SYN2013PUB can be seen as a completion of the series in terms of titles and publication dates of major Czech newspapers, which are now covered by complete volumes in comparable proportions. All SYN-series corpora can be characterized as traditional, with emphasis on cleared copyright issues, well-defined composition, reliable metadata and high-quality data processing; their overall size currently exceeds 2.2 billion running words.
2000
Multilinguality in a Text Generation System For Three Slavic Languages
Geert-Jan Kruijff | Elke Teich | John Bateman | Ivana Kruijff-Korbayova | Hana Skoumalova | Serge Sharoff | Lena Sokolova | Tony Hartley | Kamenka Staykova | Jiri Hana
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics
Resources for Multilingual Text Generation in Three Slavic Languages
John Bateman | Elke Teich | Geert-Jan Kruijff | Ivana Kruijff-Korbayová | Serge Sharoff | Hana Skoumalová
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
1997
A Czech Morphological Lexicon
Hana Skoumalova
Computational Phonology: Third Meeting of the ACL Special Interest Group in Computational Phonology
1995
An Automatic Procedure for Topic-Focus Identification
Eva Hajičová | Hana Skoumalová | Petr Sgall
Computational Linguistics, Volume 21, Number 1, March 1995
The dichotomy of topic and focus, based, in the Praguean Functional Generative Description, on the scale of communicative dynamism, is relevant not only for a possible placement of the sentence in a context, but also for its semantic interpretation. An automatic identification of topic and focus may use the input information on word order, on the systemic ordering of kinds of complementations (reflected by the underlying order of the items included in the focus), on definiteness, and on lexical semantic properties of words. An algorithm for the analysis of English sentences has been implemented and is discussed and illustrated on several examples.
1992
Co-authors
- John Bateman 2
- Milena Hnátková 2
- Geert-Jan M. Kruijff 2
- Ivana Kruijff-Korbayová 2
- Michal Křen 2
- Vladimir Petkevic 2
- Pavel Procházka 2
- Serge Sharoff 2
- Elke Teich 2
- Lucie Chlumská 1
- Václav Cvrček 1
- Eva Hajicova 1
- Jiri Hana 1
- Tony Hartley 1
- Tomáš Jelínek 1
- Dominika Kováříková 1
- Jarmila Panevová 1
- Alexandr Rosen 1
- Petr Sgall 1
- Lena Sokolova 1
- Kamenka Staykova 1
- Petr Truneček 1
- Pavel Vondřička 1
- Přemysl Vítovec 1
- Adrian Jan Zasina 1
- Tomáš Čapka 1
- Anna Čermáková 1
- Michal Škrabal 1