Text-Based Approaches to Item Alignment to Content Standards in Large-Scale Reading & Writing Tests
Yanbin Fu | Hong Jiao | Tianyi Zhou | Nan Zhang | Ming Li | Qingshu Xu | Sydney Peters | Robert W Lissitz
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Coordinated Session Papers, 2025
Aligning test items to content standards is a critical step in test development to collect validity evidence based on content. Item alignment has typically been conducted by human experts, but this judgmental process can be subjective and time-consuming. This study investigated the performance of fine-tuned small language models (SLMs) for automated item alignment using data from a large-scale standardized reading and writing test for college admissions. Different SLMs were trained for both domain and skill alignment. Model performance was evaluated using precision, recall, accuracy, weighted F1 score, and Cohen’s kappa on two test sets. The impact of input data types and training sample sizes was also explored. Results showed that including more textual inputs led to larger performance gains than increasing the training sample size. For comparison, classic supervised machine learning classifiers were trained on multilingual-E5 embeddings. The fine-tuned SLMs consistently outperformed these models, particularly for fine-grained skill alignment. To better understand the model classifications, semantic similarity analyses, including cosine similarity, Kullback-Leibler divergence of embedding distributions, and two-dimensional projections of item embeddings, revealed that certain skills in the two test datasets were semantically very close, providing evidence for the observed misclassification patterns.
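The semantic similarity analysis described in the abstract can be illustrated with a minimal sketch, assuming the sentence-transformers library and the public multilingual-E5 checkpoint on Hugging Face; the skill labels and item stems below are hypothetical placeholders, not the actual test data, and the paper's exact pipeline may differ.

```python
# Hedged sketch: mean pairwise cosine similarity between items aligned to
# different skills, using multilingual-E5 embeddings. Skill names and item
# texts are hypothetical placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("intfloat/multilingual-e5-base")

# Hypothetical items grouped by the skill label they align to.
items_by_skill = {
    "skill_A": ["Item stem 1 ...", "Item stem 2 ..."],
    "skill_B": ["Item stem 3 ...", "Item stem 4 ..."],
}

# E5 models expect a "query: " prefix on inputs; normalized embeddings make
# the dot product equal to cosine similarity.
embeddings = {
    skill: model.encode([f"query: {t}" for t in texts], normalize_embeddings=True)
    for skill, texts in items_by_skill.items()
}

# Mean cross-skill cosine similarity: values near 1 indicate semantically
# close skills, consistent with the misclassification patterns reported.
for a in embeddings:
    for b in embeddings:
        if a < b:
            sim = cosine_similarity(embeddings[a], embeddings[b]).mean()
            print(f"{a} vs {b}: mean cosine similarity = {sim:.3f}")
```

The other analyses named in the abstract would follow the same pattern: fitting a distribution to each skill's embeddings and comparing them with Kullback-Leibler divergence, or projecting the embeddings to two dimensions (e.g., with t-SNE or UMAP) to visualize skill overlap.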