Milena Slavcheva


2023

Proceedings of the 8th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing
Momchil Hardalov | Zara Kancheva | Boris Velichkov | Ivelina Nikolova-Koleva | Milena Slavcheva
Proceedings of the 8th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing

Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text
Ali Hürriyetoğlu | Hristo Tanev | Vanni Zavarella | Reyyan Yeniterzi | Erdem Yörük | Milena Slavcheva
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text

On the Road to a Protest Event Ontology for Bulgarian: Conceptual Structures and Representation Design
Milena Slavcheva | Hristo Tanev | Onur Uca
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text

The paper presents a semantic model of protest events, called Semantic Interpretations of Protest Events (SemInPE). The analytical framework used for building the semantic representations is inspired by the object-oriented paradigm in computer science and by a cognitive approach to linguistic analysis. The model is a practical application of the Unified Eventity Representation (UER) formalism, which is based on the Unified Modeling Language (UML). The multi-layered architecture of the model provides flexible means for building the semantic representations of language objects along a scale of generality and specificity. Thus, it is a suitable environment for creating the elements of ontologies on various topics and for different languages.

2022

Extended Multilingual Protest News Detection - Shared Task 1, CASE 2021 and 2022
Ali Hürriyetoğlu | Osman Mutlu | Fırat Duruşan | Onur Uca | Alaeddin Gürel | Benjamin J. Radford | Yaoyao Dai | Hansi Hettiarachchi | Niklas Stoehr | Tadashi Nomoto | Milena Slavcheva | Francielle Vargas | Aaqib Javid | Fatih Beyhan | Erdem Yörük
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)

We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task is a continuation of CASE 2021 and consists of four subtasks: i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension expands the test data with more data in previously available languages, namely English, Hindi, Portuguese, and Spanish, and adds new test data in Mandarin, Turkish, and Urdu for Subtask 1, document classification. The training data from CASE 2021 in English, Portuguese, and Spanish were utilized. Therefore, predicting document labels in Hindi, Mandarin, Turkish, and Urdu occurs in a zero-shot setting. The CASE 2022 workshop accepts reports on systems developed for predicting test data of CASE 2021 as well. We observe that the best systems submitted by CASE 2022 participants achieve between 79.71 and 84.06 F1-macro for new languages in a zero-shot setting. The winning approaches are mainly ensembling models and merging data in multiple languages. The best two submissions on CASE 2021 data outperform submissions from last year for Subtask 1 and Subtask 2 in all languages. Only the following scenarios were not outperformed by new submissions on CASE 2021: Subtask 3 Portuguese and Subtask 4 English.

2021

Monitoring Fact Preservation, Grammatical Consistency and Ethical Behavior of Abstractive Summarization Neural Models
Iva Marinova | Yolina Petrova | Milena Slavcheva | Petya Osenova | Ivaylo Radev | Kiril Simov
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

The paper describes a system for automatic summarization in English of online news data that come from different non-English languages. The system is designed to be used in a production environment for media monitoring. Automatic summarization can be very helpful in this domain when applied as a helper tool for journalists, so that they can review just the important information from the news channels. However, like every software solution, automatic summarization needs performance monitoring and an assured safe environment for the clients. In a media monitoring environment, the most problematic features to be addressed are copyright issues, factual consistency, the style of the text, and the ethical norms of journalism. Thus, the main contribution of our present work is that the above-mentioned characteristics are successfully monitored in neural automatic summarization models and improved with the help of validation, fact-preserving, and fact-checking procedures.

2013

Proceedings of the Workshop on Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants
Cristina Vertan | Milena Slavcheva | Petya Osenova
Proceedings of the Workshop on Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants

2011

Proceedings of the Workshop on Language Technologies for Digital Humanities and Cultural Heritage
Cristina Vertan | Milena Slavcheva | Petya Osenova | Stelios Piperidis
Proceedings of the Workshop on Language Technologies for Digital Humanities and Cultural Heritage

2009

Proceedings of the Workshop Multilingual resources, technologies and evaluation for central and Eastern European languages
Elena Paskaleva | Stelios Piperidis | Milena Slavcheva | Cristina Vertan
Proceedings of the Workshop Multilingual resources, technologies and evaluation for central and Eastern European languages

2006

Semantic Descriptors: The Case of Reflexive Verbs
Milena Slavcheva
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents a semantic classification of reflexive verbs in Bulgarian, augmenting the morphosyntactic classes of verbs in the large Bulgarian Lexical Data Base, a language resource utilized in a number of Language Engineering (LE) applications. The semantic descriptors conform to the Unified Eventity Representation (UER), developed by Andrea Schalley. The UER is a graphical formalism that introduces object-oriented system design to linguistic semantics. Reflexive/non-reflexive verb pairs are analyzed where the non-reflexive member of the opposition, a two-place predicate, is considered the initial linguistic entity from which the reflexive correlate is derived. The reflexive verbs are distributed into initial syntactic-semantic classes, which serve as the basis for defining the relevant semantic descriptors in the form of EVENTITY FRAME diagrams. The factors that influence the categorization of the reflexives are the lexical paradigmatic approach to the data, the choice of only one reading for each verb, and the top-level generalization of the semantic descriptors. The language models described in this paper provide the possibility of building linguistic components utilizable in knowledge-driven systems.

2004

Verb Valency Descriptors for a Syntactic Treebank
Milena Slavcheva
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Some Aspects of the Morphological Processing of Bulgarian
Milena Slavcheva
Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages

2002

Building a Linguistically Interpreted Corpus of Bulgarian: the BulTreeBank
Kiril Simov | Petya Osenova | Milena Slavcheva | Sia Kolkovska | Elisaveta Balabanova | Dimitar Doikoff | Krassimira Ivanova | Alexander Simov | Milen Kouylekov
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

1993

The Long Journey from the Core to the Real Size of Large LDBs
Elena Paskaleva | Kiril Simov | Mariana Damova | Milena Slavcheva
Acquisition of Lexical Knowledge from Text