Pranshu Rastogi

2025

Phaedrus at BEA 2025 Shared Task: Assessment of Mathematical Tutoring Dialogues through Tutor Identity Classification and Actionability Evaluation
Rajneesh Tiwari | Pranshu Rastogi
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

As Large Language Models (LLMs) are increasingly deployed in educational environments, two critical challenges emerge: identifying the source of tutoring responses and evaluating their pedagogical effectiveness. This paper presents our comprehensive approach to the BEA 2025 Shared Task, addressing both tutor identity classification (Track 5) and actionability assessment (Track 4) in mathematical tutoring dialogues. For tutor identity classification, we distinguish between human tutors (expert/novice) and seven distinct LLMs using cross-response context augmentation and ensemble techniques. For actionability assessment, we evaluate whether responses provide clear guidance on student next steps using selective attention masking and instruction-guided training. Our multi-task approach combines transformer-based models with innovative contextual feature engineering, achieving state-of-the-art performance with a CV macro F1 score of 0.9596 (test set 0.9698) for identity classification and 0.655 (test set Strict F1 0.6906) for actionability assessment. We ranked 5th in Track 4 and 1st in Track 5. Our analysis reveals that despite advances in human-like responses, LLMs maintain detectable fingerprints while showing varying levels of pedagogical actionability, with important implications for educational technology development and deployment.


Extracting Software Mentions and Relations using Transformers and LLM-Generated Synthetic Data at SOMD 2025
Pranshu Rastogi | Rajneesh Tiwari
Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025)

As part of the SOMD 2025 shared task on Software Mention Detection, we addressed the problem of detecting and disambiguating software mentions in academic texts, a very important but underappreciated factor in research transparency and reproducibility. Software is an essential building block of scientific activity, but it often does not receive official citation in scholarly literature, and there are many informal mentions that are hard to follow and analyse. In order to enhance research accessibility and interpretability, we built a system that identifies software mentions and their properties (e.g., version numbers, URLs) as named entities, and classifies relationships between them. Our dataset contained approximately 1,100 manually annotated sentences from full-text scholarly articles, representing diverse types of software such as operating systems and applications. We fine-tuned DeBERTa-based models for the Named Entity Recognition (NER) task and handled Relation Extraction (RE) as a classification problem over entity pairs. Due to the dataset size, we employed Large Language Models to create synthetic training data for augmentation. Our system achieved strong performance, with a 65% F1 score on NER (ranking 2nd in the test phase), a 47% F1 score on RE, and a combined macro F1 of 56%, demonstrating the effectiveness of our approach in this area.


fact check AI at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-checked Claim Retrieval
Pranshu Rastogi
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

The SemEval-2025 Task 7 on Multilingual and Crosslingual Fact-checked Claim Retrieval focuses on retrieving relevant Fact-checked claims for social media Posts across multiple languages. This task is particularly challenging due to linguistic barriers and the vast number of languages Fact-checkers must consider. In this work, I approach the problem as a Learning-to-Rank task and solve it using a bi-encoder-based model, fine-tuned on a pre-trained transformer optimized for sentence similarity. For the monolingual task, training was performed in both the source languages and their English translations. For cross-lingual retrieval, the training relied on English translations. Most fine-tuned models have fewer than 500M parameters, and the training was carried out efficiently using Kaggle T4 GPUs with parallelization. Despite this lightweight setup, our approach achieved 92% Success@10 for multilingual retrieval and 80% Success@10 for cross-lingual retrieval, securing 5th place in the cross-lingual track and 10th place in the multilingual setting.
