GEM: Gestalt Enhanced Markup Language Model for Web Understanding via Render Tree

Zirui Shao; Feiyu Gao; Zhongda Qi; Hangdi Xing; Jiajun Bu; Zhi Yu; Qi Zheng; Xiaozhong Liu

doi:10.18653/v1/2023.emnlp-main.375

Abstract

Inexhaustible web content carries abundant perceptible information beyond text. Unfortunately, most prior efforts in pre-trained Language Models (LMs) ignore such cyber-richness, and the few that do consider it employ only plain HTML, excluding crucial information in the rendered web page such as visual, layout, and style information. Intuitively, such perceptible web information can provide essential intelligence to facilitate content understanding tasks. This study presents an innovative Gestalt Enhanced Markup (GEM) Language Model, inspired by Gestalt psychological theory, that hosts heterogeneous visual information from the render tree in the language model without requiring additional visual input. Comprehensive experiments on multiple downstream tasks, i.e., web question answering and web information extraction, validate GEM's superiority.

Anthology ID: 2023.emnlp-main.375
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 6132–6145
URL: https://aclanthology.org/2023.emnlp-main.375/
DOI: 10.18653/v1/2023.emnlp-main.375
Cite (ACL): Zirui Shao, Feiyu Gao, Zhongda Qi, Hangdi Xing, Jiajun Bu, Zhi Yu, Qi Zheng, and Xiaozhong Liu. 2023. GEM: Gestalt Enhanced Markup Language Model for Web Understanding via Render Tree. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 6132–6145, Singapore. Association for Computational Linguistics.
Cite (Informal): GEM: Gestalt Enhanced Markup Language Model for Web Understanding via Render Tree (Shao et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.375.pdf
Video: https://aclanthology.org/2023.emnlp-main.375.mp4
