Anastasiia Bashmakova
2022
TAPE: Assessing Few-shot Russian Language Understanding
Ekaterina Taktasheva | Alena Fenogenova | Denis Shevelev | Nadezhda Katricheva | Maria Tikhonova | Albina Akhmetgareeva | Oleg Zinkevich | Anastasiia Bashmakova | Svetlana Iordanskaia | Valentina Kurenshchikova | Alena Spiridonova | Ekaterina Artemova | Tatiana Shavrina | Vladislav Mikhailov
Findings of the Association for Computational Linguistics: EMNLP 2022
Recent advances in zero-shot and few-shot learning have shown promise for a range of research and practical purposes. However, this fast-growing area lacks standardized evaluation suites for non-English languages, hindering progress outside the Anglo-centric paradigm. To address this gap, we propose TAPE (Text Attack and Perturbation Evaluation), a novel benchmark that includes six more complex NLU tasks for Russian, covering multi-hop reasoning, ethical concepts, logic and commonsense knowledge. TAPE's design focuses on systematic zero-shot and few-shot NLU evaluation: (i) linguistic-oriented adversarial attacks and perturbations for analyzing robustness, and (ii) subpopulations for nuanced interpretation. A detailed analysis of the autoregressive baselines indicates that simple spelling-based perturbations affect performance the most, while paraphrasing the input has a smaller effect. At the same time, the results demonstrate a significant gap between the neural and human baselines for most tasks. We publicly release TAPE (https://tape-benchmark.com) to foster research on robust LMs that can generalize to new tasks when little to no supervision is available.
2020
Modelling Narrative Elements in a Short Story: A Study on Annotation Schemes and Guidelines
Elena Mikhalkova | Timofei Protasov | Polina Sokolova | Anastasiia Bashmakova | Anastasiia Drozdova
Proceedings of the Twelfth Language Resources and Evaluation Conference
Text-processing algorithms that annotate the main components of a storyline are presently in great need of corpora and agreed-upon annotation schemes. The Text World Theory of cognitive linguistics offers a model that generalizes narrative structure in the form of world-building elements (characters, time and space), as well as text worlds themselves and the switches between them. We conducted a survey of how text worlds and their elements are annotated in different projects and proposed our own annotation scheme and instructions. We first tested them on the science fiction story “We Can Remember It for You Wholesale” by Philip K. Dick. We then corrected the guidelines, added computer annotation of verb forms to achieve higher inter-rater agreement, and tested them again on the short story “The Gift of the Magi” by O. Henry. As a result, the agreement among the three raters rose. With due revision and tests, our annotation scheme and guidelines can be used for annotating narratives in corpora of literary texts, criminal evidence, teaching materials, quests, etc.