2022
National Language Technology Platform for Public Administration
Marko Tadić | Daša Farkaš | Matea Filko | Artūrs Vasiļevskis | Andrejs Vasiļjevs | Jānis Ziediņš | Željka Motika | Mark Fishel | Hrafn Loftsson | Jón Guðnason | Claudia Borg | Keith Cortis | Judie Attard | Donatienne Spiteri
Proceedings of the Workshop Towards Digital Language Equality within the 13th Language Resources and Evaluation Conference
This article presents work in progress on a collaborative project of several European countries to develop a National Language Technology Platform (NLTP). The project aims to combine the most advanced Language Technology tools and solutions in a new, state-of-the-art, Artificial Intelligence-driven National Language Technology Platform for five official, lower-resourced EU/EEA languages.
2019
Redesign of the Croatian derivational lexicon
Matea Filko | Krešimir Šojat | Vanja Štefanec
Proceedings of the Second International Workshop on Resources and Tools for Derivational Morphology
2018
Further expansion of the Croatian WordNet
Krešimir Šojat | Matea Filko | Antoni Oliver
Proceedings of the 9th Global Wordnet Conference
This paper presents a semi-automatic procedure for the expansion of the Croatian Wordnet (CroWN). An English-Croatian dictionary was used to translate monosemous PWN 3.0 English variants. The precision of the automatic process is low (about 30%), but the results proved valuable for the enlargement of CroWN. After manual validation, 10,884 new synset-variant pairs were added to CroWN, bringing the total to 62,075 synset-variant pairs.
2016
HR4EU – Using Language Resources in Computer Aided Language Learning
Daša Farkaš | Matea Filko | Marko Tadić
Proceedings of the Second International Conference on Computational Linguistics in Bulgaria (CLIB 2016)
In this paper we present HR4EU, a web portal for e-learning of the Croatian language. The web portal offers a new method of computer aided language learning (CALL) by encouraging language learners to use different language resources available for Croatian: corpora, inflectional and derivational morphological lexicons, a treebank, Wordnet, etc. Apart from the previously developed language resources, new ones are created in order to further facilitate the learning of Croatian. We focus on the use in CALL of a treebank annotated at the syntactic and semantic levels, and describe the new HR4EU sub-corpus of the Croatian Dependency Treebank (HOBS). The HR4EU sub-corpus consists of approx. 550 sentences, which are manually annotated at the syntactic and semantic role levels according to the specifications used for HOBS. The syntactic and semantic structure of a sentence can be visualized as a dependency tree via the SynSem Visualizer. This visualization will help users to produce syntactically and semantically correct sentences on their own.
Verbal Multiword Expressions in Croatian
Krešimir Šojat | Matea Filko | Daša Farkaš
Proceedings of the Second International Conference on Computational Linguistics in Bulgaria (CLIB 2016)
The paper deals with verbal multiword expressions in Croatian. We focus on four types of verbal constructions: light verb constructions, i.e. constructions consisting of a light verb and a noun or prepositional phrase; complex predicate constructions, i.e. constructions consisting of a finite verb and an infinitive; prepositional verb constructions, i.e. constructions consisting of a verb and a typical preposition; and, finally, verbal idioms, i.e. constructions with completely idiosyncratic meanings. All the constructions are annotated in the Universal Dependency treebank for Croatian. The identification of verbal multiword expressions is an important step in numerous NLP tasks. It is also important to define and delimit this concept in linguistic theory.