Marius Micluța-Câmpeanu

Also published as: Marius Micluta-Campeanu, Marius Micluta - Campeanu


2026

The ubiquitous adoption of large language models by students prompts teachers to redesign courses and evaluation methods, especially in computer science and natural language processing (NLP), where the impact is more tangible. Our contribution is two-fold. First, we attempt to define invariants for the role of education itself, given the over-abundance of information that appears to be more accessible than ever before. Then, we present our approach and the materials used for an introductory NLP course for undergraduate students, drawing inspiration from software engineering best practices. Our vision regarding large language models is to rely on local models to cultivate a sense of ownership and sovereignty in an age where every bit of independence and privacy is being eroded.

2025

This paper presents a data-driven analysis of Romanian secondary school textbooks through the lens of Bloom’s Taxonomy, focusing on the promotion of critical thinking in instructional design. Using the ROTEX corpus, we extract and annotate almost 2 million words of Romanian Language and Literature textbooks (grades 5-8) with Bloom-aligned labels for verbs associated with pedagogical tasks. Our annotation pipeline combines automatic verb extraction, human filtering based on syntactic form and task relevance, and manual assignment of Bloom labels supported by in-text concordance checks. The resulting dataset enables fine-grained analysis of task complexity both across and within textbooks and grade levels. Our findings reveal a general lack of structured cognitive progression across most textbook series. We also propose a multi-dimensional framework combining cognitive-level and linguistic evaluation to assess instructional design quality. This work contributes annotated resources and reproducible methods for NLP-based educational content analysis in low-resource languages.
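The verb-labeling step of the pipeline above can be sketched as a simple dictionary lookup. This is a minimal illustrative sketch, not the paper's actual pipeline: the verb list, the Bloom-level assignments, and the whitespace tokenization are all hypothetical simplifications (the paper combines automatic verb extraction with human filtering and concordance checks).

```python
# Hypothetical verb -> Bloom level mapping (illustrative entries only).
BLOOM = {
    "identifica": "Remember",    # identify
    "explica": "Understand",     # explain
    "aplica": "Apply",           # apply
    "analizeaza": "Analyze",     # analyze
    "evalueaza": "Evaluate",     # evaluate
    "creeaza": "Create",         # create
}

def tag_task(instruction):
    """Return Bloom labels for known task verbs in a textbook instruction."""
    return [BLOOM[w] for w in instruction.lower().split() if w in BLOOM]

print(tag_task("Identifica personajele si explica rolul lor"))
# -> ['Remember', 'Understand']
```

In the actual study, the candidate verbs are extracted automatically and then filtered by humans based on syntactic form and task relevance before any Bloom label is assigned.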
In this paper, we present a similarity-based method for explainable classification in the context of the SemEval 2025 Task 9: The Food Hazard Detection Challenge. Our proposed system is essentially unsupervised, leveraging the semantic properties of the labels. This approach brings some key advantages over typical classification systems. First, similarity metrics offer a more intuitive interpretation. Next, this technique allows for inference on novel labels. Finally, there is a non-negligible number of ambiguous labels, so learning a direct mapping does not lead to meaningful representations. Our team ranks 13th for the second sub-task among participants that used only the title and the text as features. Our method is generic and can be applied to any classification task.
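The core idea of similarity-based classification can be sketched as follows: embed the input and each label, then pick the label with the highest similarity. This is a toy sketch under stated assumptions, not the submitted system: it uses bag-of-words vectors with cosine similarity in place of the semantic representations the paper relies on, and the label set is invented for illustration.

```python
from collections import Counter
import math

def vec(text):
    # Toy bag-of-words vector; a real system would use sentence embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text, labels):
    # Assign the label whose surface form is most similar to the input.
    tv = vec(text)
    return max(labels, key=lambda lab: cosine(tv, vec(lab)))

labels = ["biological hazard", "chemical hazard", "foreign body"]
print(classify("chemical contamination hazard detected", labels))
# -> 'chemical hazard'
```

Because labels are compared in the same space as inputs, a previously unseen label can be scored at inference time without retraining, which is the property the abstract highlights.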
The following paper is a joint contribution to the 2025 ReproNLP shared task, part of the ReproHum project. We focused on reproducing the human evaluation based on a single criterion, namely the factuality of automated scientific text generation systems, from August et al. (2022). In accordance with the ReproHum guidelines, we followed the original study as closely as possible, with two human raters who coded 300 ratings each. Moreover, we conducted an additional study on two subsets of the dataset based on domain (medicine and physics), in which we employed expert annotators. Our reproduction of the factuality assessment found similar overall rates of factual inaccuracies across models. However, variability and weak agreement with the original model rankings suggest challenges in reliably reproducing results, especially when results are close.

2024

This paper describes the approach of the UniBuc team in tackling the SemEval 2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. We used SOLAR Instruct, without any fine-tuning, while focusing on input manipulation and tailored prompting. By customizing prompts for individual CTR sections, in both zero-shot and few-shot settings, we managed to achieve a consistency score of 0.72, ranking 14th on the leaderboard. Our thorough error analysis revealed that our model has a tendency to take shortcuts and rely on simple heuristics, especially when dealing with semantic-preserving changes.
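Section-tailored prompting of the kind described above amounts to composing a prompt template around one CTR section, optionally prefixed with a few worked examples. The sketch below is a hypothetical illustration: the function name, section names, and answer format are assumptions, not the team's actual templates.

```python
def build_prompt(section_name, section_text, statement, examples=()):
    """Compose a zero- or few-shot NLI prompt tailored to one CTR section.

    `examples` is an iterable of (section_text, statement, answer) triples;
    leaving it empty yields a zero-shot prompt.
    """
    shots = "\n\n".join(
        f"Section: {s}\nStatement: {st}\nAnswer: {a}" for s, st, a in examples
    )
    query = (
        f"Section ({section_name}): {section_text}\n"
        f"Statement: {statement}\n"
        "Answer with Entailment or Contradiction:"
    )
    return (shots + "\n\n" + query) if shots else query

p = build_prompt("Eligibility", "Adults aged 18-65.", "The trial enrols children.")
print(p)
```

The prompt string would then be passed to the instruction-tuned model; varying the template per section is what allows the wording to match each section's conventions.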
The following paper presents the outcomes of a collaborative experiment on human evaluation from the ReproNLP 2024 shared task, track B, part of the ReproHum project. We evaluated a QAG (question-answer generation) system centered on English children’s storybooks, presented in previous research, using human evaluators. The system generated relevant QA (question-answer) pairs based on FairytaleQA, a dataset of storybooks for early education (kindergarten up to middle school). Within the framework of the ReproHum project, we first outline the original paper and the reproduction strategy that was decided upon. We then describe the complete setup of the first human evaluation, along with the modifications required to replicate it, and discuss other relevant related work. In conclusion, we compare the replication outcomes with those documented in the original publication, and we examine the general features of this endeavor as well as its shortcomings.