Enhancing Textbooks with Visuals from the Web for Improved Learning

Janvijay Singh; Vilém Zouhar; Mrinmaya Sachan

doi:10.18653/v1/2023.emnlp-main.731

Abstract

Textbooks are one of the main mediums for delivering high-quality education to students. In particular, explanatory and illustrative visuals play a key role in retention, comprehension and general transfer of knowledge. However, many textbooks lack these interesting visuals to support student learning. In this paper, we investigate the effectiveness of vision-language models to automatically enhance textbooks with images from the web. We collect a dataset of e-textbooks in the math, science, social science and business domains. We then set up a text-image matching task that involves retrieving and appropriately assigning web images to textbooks, which we frame as a matching optimization problem. Through a crowd-sourced evaluation, we verify that (1) while the original textbook images are rated higher, automatically assigned ones are not far behind, and (2) the precise formulation of the optimization problem matters. We release the dataset of textbooks with an associated image bank to inspire further research in this intersectional area of computer vision and NLP for education.
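To make the phrase "matching optimization problem" concrete, here is a minimal sketch of one way such a text-image assignment could be posed. It is not the authors' formulation, which the abstract does not spell out. It assumes a precomputed similarity score between every text section and every candidate image (the values below are invented), requires that each image be used at most once, and needs at least as many images as sections. The names `best_assignment` and `toy_similarity` are hypothetical.

```python
from itertools import permutations


def best_assignment(similarity):
    """Return (total score, assignment) maximizing the summed similarity.

    similarity[i][j] is an assumed score between text section i and
    candidate image j. Each image is assigned to at most one section,
    and there must be at least as many images as sections.
    """
    n_sections = len(similarity)
    n_images = len(similarity[0])
    best_score, best_choice = float("-inf"), None
    # Brute force over ordered picks of distinct images; toy sizes only.
    for choice in permutations(range(n_images), n_sections):
        score = sum(similarity[i][j] for i, j in enumerate(choice))
        if score > best_score:
            best_score, best_choice = score, choice
    return best_score, list(best_choice)


# Invented scores for 2 text sections and 3 candidate images.
toy_similarity = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
]
score, assignment = best_assignment(toy_similarity)
```

In this toy case, picking the best image for each section in turn would give section 0 its top-scoring image 0 and leave section 1 with a weak match. The joint search instead assigns image 1 to section 0 and image 0 to section 1, which has a higher total. This is one illustration of why the choice of formulation can change the outcome; for realistic sizes a polynomial-time assignment solver would replace the brute-force loop.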

Anthology ID: 2023.emnlp-main.731
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 11931–11944
URL: https://aclanthology.org/2023.emnlp-main.731/
DOI: 10.18653/v1/2023.emnlp-main.731
Cite (ACL): Janvijay Singh, Vilém Zouhar, and Mrinmaya Sachan. 2023. Enhancing Textbooks with Visuals from the Web for Improved Learning. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 11931–11944, Singapore. Association for Computational Linguistics.
Cite (Informal): Enhancing Textbooks with Visuals from the Web for Improved Learning (Singh et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.731.pdf
Video: https://aclanthology.org/2023.emnlp-main.731.mp4
