2020
pdf
bib
abs
Creating Expert Knowledge by Relying on Language Learners: a Generic Approach for Mass-Producing Language Resources by Combining Implicit Crowdsourcing and Language Learning
Lionel Nicolas
|
Verena Lyding
|
Claudia Borg
|
Corina Forascu
|
Karën Fort
|
Katerina Zdravkova
|
Iztok Kosem
|
Jaka Čibej
|
Špela Arhar Holdt
|
Alice Millour
|
Alexander König
|
Christos Rodosthenous
|
Federico Sangati
|
Umair ul Hassan
|
Anisia Katinskaia
|
Anabela Barreiro
|
Lavinia Aparaschivei
|
Yaakov HaCohen-Kerner
Proceedings of the Twelfth Language Resources and Evaluation Conference
We introduce in this paper a generic approach to combine implicit crowdsourcing and language learning in order to mass-produce language resources (LRs) for any language for which a crowd of language learners can be involved. We present the approach by explaining its core paradigm that consists in pairing specific types of LRs with specific exercises, by detailing both its strengths and challenges, and by discussing how much these challenges have been addressed at present. Accordingly, we also report on on-going proof-of-concept efforts aiming at developing the first prototypical implementation of the approach in order to correct and extend an LR called ConceptNet based on the input crowdsourced from language learners. We then present an international network called the European Network for Combining Language Learning with Crowdsourcing Techniques (enetCollect) that provides the context to accelerate the implementation of this generic approach. Finally, we exemplify how it can be used in several language learning scenarios to produce a multitude of NLP resources and how it can therefore alleviate the long-standing NLP issue of the lack of LRs.
pdf
bib
abs
Using Crowdsourced Exercises for Vocabulary Training to Expand ConceptNet
Christos Rodosthenous
|
Verena Lyding
|
Federico Sangati
|
Alexander König
|
Umair ul Hassan
|
Lionel Nicolas
|
Jolita Horbacauskiene
|
Anisia Katinskaia
|
Lavinia Aparaschivei
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this work, we report on a crowdsourcing experiment conducted using the V-TREL vocabulary trainer which is accessed via a Telegram chatbot interface to gather knowledge on word relations suitable for expanding ConceptNet. V-TREL is built on top of a generic architecture implementing the implicit crowdsourding paradigm in order to offer vocabulary training exercises generated from the commonsense knowledge-base ConceptNet and – in the background – to collect and evaluate the learners’ answers to extend ConceptNet with new words. In the experiment about 90 university students learning English at C1 level, based on Common European Framework of Reference for Languages (CEFR), trained their vocabulary with V-TREL over a period of 16 calendar days. The experiment allowed to gather more than 12,000 answers from learners on different question types. In this paper we present in detail the experimental setup and the outcome of the experiment, which indicates the potential of our approach for both crowdsourcing data as well as fostering vocabulary skills.
pdf
bib
abs
FII-UAIC at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text Using CNN
Lavinia Aparaschivei
|
Andrei Palihovici
|
Daniela Gîfu
Proceedings of the Fourteenth Workshop on Semantic Evaluation
The “Sentiment Analysis for Code-Mixed Social Media Text” task at the SemEval 2020 competition focuses on sentiment analysis in code-mixed social media text , specifically, on the combination of English with Spanish (Spanglish) and Hindi (Hinglish). In this paper, we present a system able to classify tweets, from Spanish and English languages, into positive, negative and neutral. Firstly, we built a classifier able to provide corresponding sentiment labels. Besides the sentiment labels, we provide the language labels at the word level. Secondly, we generate a word-level representation, using Convolutional Neural Network (CNN) architecture. Our solution indicates promising results for the Sentimix Spanglish-English task (0.744), the team, Lavinia_Ap, occupied the 9th place. However, for the Sentimix Hindi-English task (0.324) the results have to be improved.