Automatic scoring of short answers using justification cues estimated by BERT

Shunya Takano, Osamu Ichikawa
Abstract
Automated scoring of short-answer questions has attracted attention as a way to improve the fairness of scoring and reduce the burden on human scorers. In general, a large amount of data is required to train an automated scoring model. The training data consist of answer texts and the scores assigned to them, and may also include annotations marking key word sequences; preparing these data manually is costly. Many previous studies have built models from large amounts of training data specific to each question. This paper aims to achieve equivalent performance with less training data by utilizing a BERT model pre-trained on a large amount of general text not necessarily related to short-answer questions. On the RIKEN dataset, the proposed method reduces the required training data from the 800 examples used in previous work to about 400, while still achieving scoring accuracy comparable to that of humans.
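The abstract does not spell out the model details, but the general pipeline it describes, estimating per-token justification cues and then aggregating them into a score, can be illustrated with a minimal sketch. Everything below is a hypothetical illustration, not the authors' implementation: in the paper the cue estimates come from a fine-tuned BERT model, and the threshold and linear aggregation rule here are assumptions for the example.

```python
# Hypothetical sketch: turn per-token "justification cue" probabilities
# (in the paper, estimated by a fine-tuned BERT model) into an item score.
# The threshold and the linear aggregation rule are illustrative assumptions.

def score_answer(tokens, cue_probs, max_score=3, threshold=0.5):
    """Score an answer from token-level justification-cue probabilities.

    tokens     -- answer tokens
    cue_probs  -- probability that each token is part of a justification cue
    max_score  -- full marks for the item
    threshold  -- cue probability above which a token counts as a cue
    """
    if len(tokens) != len(cue_probs):
        raise ValueError("tokens and cue_probs must align")
    # Fraction of tokens flagged as justification cues.
    cue_fraction = sum(p >= threshold for p in cue_probs) / len(tokens)
    # Map cue coverage onto the 0..max_score scale (assumed linear rule).
    return round(max_score * cue_fraction)

# Toy example: four of the seven tokens exceed the cue threshold.
tokens = ["the", "mitochondria", "produce", "ATP", "for", "the", "cell"]
cue_probs = [0.1, 0.9, 0.8, 0.95, 0.2, 0.1, 0.6]
print(score_answer(tokens, cue_probs))  # → 2
```

The point of such a decomposition is that the cue estimator, rather than the scoring rule, is where the pre-trained BERT model contributes, which is what lets the approach get by with fewer question-specific training examples.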
Anthology ID:
2022.bea-1.2
Volume:
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Month:
July
Year:
2022
Address:
Seattle, Washington
Venues:
BEA | NAACL
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
8–13
URL:
https://aclanthology.org/2022.bea-1.2
DOI:
10.18653/v1/2022.bea-1.2
Cite (ACL):
Shunya Takano and Osamu Ichikawa. 2022. Automatic scoring of short answers using justification cues estimated by BERT. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), pages 8–13, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Automatic scoring of short answers using justification cues estimated by BERT (Takano & Ichikawa, BEA 2022)
PDF:
https://aclanthology.org/2022.bea-1.2.pdf