Textual Coverage of Eventive Entries in Lexical Semantic Resources

Eva Fučíková, Cristina Fernández Alcaina, Jan Hajič, Zdeňka Urešová


Abstract
This short paper examines how well the eventive entries (verbs, predicates, etc.) of several well-known lexical semantic resources cover random running text taken from the internet. While coverage gaps are often reported for manually created lexicons (as most semantically oriented lexicons are), our aim was to quantify these gaps cross-lingually on a new, purely textual resource set produced by the HPLT Project from crawled internet data. Several English, German, Spanish and Czech lexical semantic resources, which for the most part focus on verbs and predicates, were selected for this experiment. We also describe the challenges arising from the fact that these resources are (to a varying extent) semantically oriented: the texts have to be preprocessed to obtain lemmas (base forms) and some types of multiword expressions (MWEs) before coverage can be reasonably evaluated, so the results are necessarily only approximate. The coverage of these resources, with some exclusions described in the paper, ranges from 41.00% to 97.33%, confirming the need to expand at least some resources, even well-known ones, to cover what is today the prevailing source of textual data with regard to lexical units describing events or states (or possibly other eventive mentions).
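As a rough illustration of the kind of measurement the abstract describes, the following minimal sketch computes the share of verb lemmas in a text that appear in a lexicon. It is not the authors' procedure. It assumes the text has already been lemmatized and POS-tagged into `(lemma, upos)` pairs, ignores MWEs, and reports both token-level and type-level coverage, since the abstract does not say which definition the paper uses. The function name `coverage`, the `sample` data and the `lexicon` set are all hypothetical.

```python
from collections import Counter


def coverage(tagged_lemmas, lexicon, pos="VERB"):
    """Return (token_coverage, type_coverage) in percent.

    tagged_lemmas: iterable of (lemma, upos) pairs (assumed preprocessing output).
    lexicon: iterable of lemma strings standing in for a resource's entries.
    """
    # Count occurrences of each lemma carrying the requested POS tag.
    counts = Counter(lemma.lower() for lemma, upos in tagged_lemmas if upos == pos)
    total_tokens = sum(counts.values())
    if total_tokens == 0:
        return 0.0, 0.0

    entries = {entry.lower() for entry in lexicon}
    # Token-level: every occurrence counts; type-level: each distinct lemma counts once.
    covered_tokens = sum(n for lemma, n in counts.items() if lemma in entries)
    covered_types = sum(1 for lemma in counts if lemma in entries)
    return (100.0 * covered_tokens / total_tokens,
            100.0 * covered_types / len(counts))


# Hypothetical toy data: four verb tokens, three distinct verb lemmas.
sample = [("run", "VERB"), ("dog", "NOUN"), ("run", "VERB"),
          ("google", "VERB"), ("eat", "VERB")]
lexicon = {"run", "eat"}

token_cov, type_cov = coverage(sample, lexicon)
print(f"token coverage: {token_cov:.2f}%, type coverage: {type_cov:.2f}%")
```

The two figures can differ noticeably: frequent lemmas dominate token-level coverage, while rare or novel lemmas weigh more heavily in type-level coverage.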
Anthology ID:
2024.lrec-main.1375
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
15835–15841
URL:
https://aclanthology.org/2024.lrec-main.1375
Cite (ACL):
Eva Fučíková, Cristina Fernández Alcaina, Jan Hajič, and Zdeňka Urešová. 2024. Textual Coverage of Eventive Entries in Lexical Semantic Resources. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 15835–15841, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Textual Coverage of Eventive Entries in Lexical Semantic Resources (Fučíková et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.1375.pdf