Cross Encoding as Augmentation: Towards Effective Educational Text Classification

Hyun Seung Lee; Seungtaek Choi; Yunsung Lee; Hyeongdon Moon; Shinhyeok Oh; Myeongho Jeong; Hyojun Go; Christian Wallraven

doi:10.18653/v1/2023.findings-acl.137

Abstract

Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content such as questions and textbooks. However, auto-tagging suffers from data scarcity, which stems from two major challenges: 1) the tag space is large, and 2) the task is multi-label. Although retrieval approaches are reportedly effective in low-resource scenarios, few efforts have directly addressed the data scarcity problem. To mitigate these issues, we propose CEAA, a novel retrieval approach that provides effective learning in educational text classification. Our main contributions are as follows: 1) we leverage transfer learning from question-answering datasets, and 2) we propose a simple but effective data augmentation method that introduces cross-encoder style texts into a bi-encoder architecture for more efficient inference. An extensive set of experiments shows that our proposed method is effective in multi-label scenarios and on low-resource tags compared to state-of-the-art models.
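
The augmentation idea in the abstract can be pictured with a small sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the `[SEP]` separator, the `Example` record, and the `build_examples` helper are hypothetical names, and pairing each question with its gold tags is only one plausible reading of "cross-encoder style texts". It is not the authors' implementation.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed separator token; the abstract does not say how texts are joined.
SEP = " [SEP] "


@dataclass(frozen=True)
class Example:
    query_text: str  # text fed to the question-side encoder of the bi-encoder
    tag: str         # tag that the query text should retrieve
    augmented: bool  # True when query_text is a cross-encoder style pair


def build_examples(question: str, tags: list[str]) -> list[Example]:
    """Build plain and cross-encoder style training examples for one question.

    Hypothetical helper: for every gold tag it emits the plain question and
    an augmented copy in which question and tag are concatenated, the way a
    cross-encoder would see the pair.
    """
    examples: list[Example] = []
    for tag in tags:
        # Plain bi-encoder example: the question alone is matched to the tag.
        examples.append(Example(question, tag, augmented=False))
        # Augmented example: question and tag joined into a single input text.
        examples.append(Example(question + SEP + tag, tag, augmented=True))
    return examples


if __name__ == "__main__":
    for ex in build_examples("Solve 2x + 3 = 7.", ["linear equations", "algebra"]):
        print(ex)
```

Under this reading, the augmented texts would be used only during training, so inference would still encode the question and the tags separately, which is where a bi-encoder gets its efficiency.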

Anthology ID: 2023.findings-acl.137
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2184–2195
URL: https://aclanthology.org/2023.findings-acl.137
DOI: 10.18653/v1/2023.findings-acl.137
Cite (ACL): Hyun Seung Lee, Seungtaek Choi, Yunsung Lee, Hyeongdon Moon, Shinhyeok Oh, Myeongho Jeong, Hyojun Go, and Christian Wallraven. 2023. Cross Encoding as Augmentation: Towards Effective Educational Text Classification. In Findings of the Association for Computational Linguistics: ACL 2023, pages 2184–2195, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Cross Encoding as Augmentation: Towards Effective Educational Text Classification (Lee et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.137.pdf
