Annotations for Exploring Food Tweets from Multiple Aspects

Matiss Rikters, Rinalds Vīksna, Edison Marrese-Taylor


Abstract
This research builds upon the Latvian Twitter Eater Corpus (LTEC), which focuses on the narrow domain of tweets related to food, drinks, eating and drinking. LTEC has been collected for more than 12 years and contains almost 3 million tweets, with basic information as well as extended automatically and manually annotated metadata. In this paper we supplement the LTEC with manually annotated subsets of evaluation data for machine translation, named entity recognition, timeline-balanced sentiment analysis, and text-image relation classification. We experiment with each of the data sets using baseline models and highlight future challenges for various modelling approaches.
Anthology ID:
2024.lrec-main.111
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
1233–1238
URL:
https://aclanthology.org/2024.lrec-main.111
Cite (ACL):
Matiss Rikters, Rinalds Vīksna, and Edison Marrese-Taylor. 2024. Annotations for Exploring Food Tweets from Multiple Aspects. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 1233–1238, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Annotations for Exploring Food Tweets from Multiple Aspects (Rikters et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.111.pdf
Optional supplementary material:
 2024.lrec-main.111.OptionalSupplementaryMaterial.zip