Rory Devine
2021
Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability
Venelin Kovatchev
|
Phillip Smith
|
Mark Lee
|
Rory Devine
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
In this paper we implement and compare 7 different data augmentation strategies for the task of automatic scoring of children’s ability to understand others’ thoughts, feelings, and desires (or “mindreading”). We recruit in-domain experts to re-annotate augmented samples and determine to what extent each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine the capabilities of automatic systems to generalize to unseen data, we create UK-MIND-20 - a new corpus of children’s performance on tests of mindreading, consisting of 10,320 question-answer pairs. We obtain a new state-of-the-art performance on the MIND-CA corpus, improving macro-F1-score by 6 points. Results indicate that both the number of training examples and the quality of the augmentation strategies affect the performance of the systems. The task-specific augmentations generally outperform task-agnostic augmentations. Automatic augmentations based on vectors (GloVe, FastText) perform the worst. We find that systems trained on MIND-CA generalize well to UK-MIND-20. We demonstrate that data augmentation strategies also improve the performance on unseen data.
2020
“What is on your mind?” Automated Scoring of Mindreading in Childhood and Early Adolescence
Venelin Kovatchev
|
Phillip Smith
|
Mark Lee
|
Imogen Grumley Traynor
|
Irene Luque Aguilera
|
Rory Devine
Proceedings of the 28th International Conference on Computational Linguistics
In this paper we present the first work on the automated scoring of mindreading ability in middle childhood and early adolescence. We create MIND-CA, a new corpus of 11,311 question-answer pairs in English from 1,066 children aged from 7 to 14. We perform machine learning experiments and carry out extensive quantitative and qualitative evaluation. We obtain promising results, demonstrating the applicability of state-of-the-art NLP solutions to a new domain and task.