Ekaterina Loginova


Structural information in mathematical formulas for exercise difficulty prediction: a comparison of NLP representations
Ekaterina Loginova | Dries Benoit
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)

To tailor a learning system to the student’s level and needs, we must consider the characteristics of the learning content, such as its difficulty. While natural language processing allows us to represent text efficiently, the meaningful representation of mathematical formulas in an educational context is still understudied. This paper adopts structural embeddings as a possible way to bridge this gap. Our experiments validate the approach using publicly available datasets to show that incorporating syntactic information can improve performance in predicting the exercise difficulty.


Towards the Application of Calibrated Transformers to the Unsupervised Estimation of Question Difficulty from Text
Ekaterina Loginova | Luca Benedetto | Dries Benoit | Paolo Cremonesi
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Being able to accurately perform Question Difficulty Estimation (QDE) can improve the accuracy of students’ assessment and better their learning experience. Traditional approaches to QDE are either subjective or introduce a long delay before new questions can be used to assess students. Thus, recent work proposed machine learning-based approaches to overcome these limitations. They use questions of known difficulty to train models capable of inferring the difficulty of questions from their text. Once trained, they can be used to perform QDE of newly created questions. Existing approaches employ supervised models which are domain-dependent and require a large dataset of questions of known difficulty for training. Therefore, they cannot be used if such a dataset is not available (e.g., for new courses on an e-learning platform). In this work, we experiment with the possibility of performing QDE from text in an unsupervised manner. Specifically, we use the uncertainty of calibrated question answering models as a proxy of human-perceived difficulty. Our experiments show promising results, suggesting that model uncertainty could be successfully leveraged to perform QDE from text, reducing both costs and elapsed time.
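
The idea of using model uncertainty as a difficulty proxy can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes temperature scaling as the calibration step and normalized entropy over answer options as the uncertainty measure, and the function names, logits, and temperature value are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a temperature above 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(n), so the score lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def difficulty_proxy(logits, temperature=1.0):
    """Assumed proxy: higher answer uncertainty is read as higher difficulty."""
    return normalized_entropy(softmax(logits, temperature))

# Hypothetical logits over four answer options for two questions.
# In practice the temperature would be fitted on held-out data.
confident = difficulty_proxy([6.0, 0.5, 0.2, 0.1], temperature=1.5)
uncertain = difficulty_proxy([1.2, 1.0, 0.9, 1.1], temperature=1.5)
print(confident < uncertain)
```

Under these assumptions, a question on which the model spreads probability almost evenly across options receives a score near 1, while one it answers confidently receives a score near 0.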


Code-Mixed Question Answering Challenge: Crowd-sourcing Data and Techniques
Khyathi Chandu | Ekaterina Loginova | Vishal Gupta | Josef van Genabith | Günter Neumann | Manoj Chinnakotla | Eric Nyberg | Alan W. Black
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching

Code-Mixing (CM) is the phenomenon of alternating between two or more languages, which is prevalent in bi- and multi-lingual communities. Most NLP applications today are still designed with the assumption of a single interaction language and are most likely to break given a CM utterance with multiple languages mixed at a morphological, phrase or sentence level. For example, popular commercial search engines do not yet fully understand the intents expressed in CM queries. As a first step towards fostering research which supports CM in NLP applications, we systematically crowd-sourced and curated an evaluation dataset for factoid question answering in three CM languages: Hinglish (Hindi+English), Tenglish (Telugu+English) and Tamlish (Tamil+English), which belong to two language families (Indo-Aryan and Dravidian). We share the details of our data collection process, techniques which were used to avoid inducing lexical bias amongst the crowd workers and other CM-specific linguistic properties of the dataset. Our final dataset, which is available freely for research purposes, has 1,694 Hinglish, 2,848 Tamlish and 1,391 Tenglish factoid questions and their answers. We discuss the techniques used by the participants for the first edition of this ongoing challenge.

An Interactive Web-Interface for Visualizing the Inner Workings of the Question Answering LSTM
Ekaterina Loginova | Günter Neumann
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present a visualisation tool which aims to illuminate the inner workings of an LSTM model for question answering. It plots heatmaps of neurons’ firings and allows a user to check the dependency between neurons and manual features. The system possesses an interactive web-interface and can be adapted to other models and domains.