TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining

Qing Zong, Zhaowei Wang, Baixuan Xu, Tianshi Zheng, Haochen Shi, Weiqi Wang, Yangqiu Song, Ginny Wong, Simon See


Abstract
A main goal of Argument Mining (AM) is to analyze an author’s stance. Unlike previous AM datasets, which focus only on text, the shared task at the 10th Workshop on Argument Mining introduces a dataset that includes both texts and images. Importantly, these images contain both visual elements and optical characters. Our new framework, TILFA (A Unified Framework for Text, Image, and Layout Fusion in Argument Mining), is designed to handle this mixed data. It excels not only at understanding text but also at detecting optical characters and recognizing layout details in images. Our model significantly outperforms existing baselines, earning our team, KnowComp, first place on the leaderboard of the Argumentative Stance Classification subtask in this shared task.
Anthology ID:
2023.argmining-1.14
Volume:
Proceedings of the 10th Workshop on Argument Mining
Month:
December
Year:
2023
Address:
Singapore
Editors:
Milad Alshomary, Chung-Chi Chen, Smaranda Muresan, Joonsuk Park, Julia Romberg
Venues:
ArgMining | WS
Publisher:
Association for Computational Linguistics
Pages:
139–147
URL:
https://aclanthology.org/2023.argmining-1.14
DOI:
10.18653/v1/2023.argmining-1.14
Cite (ACL):
Qing Zong, Zhaowei Wang, Baixuan Xu, Tianshi Zheng, Haochen Shi, Weiqi Wang, Yangqiu Song, Ginny Wong, and Simon See. 2023. TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining. In Proceedings of the 10th Workshop on Argument Mining, pages 139–147, Singapore. Association for Computational Linguistics.
Cite (Informal):
TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining (Zong et al., ArgMining-WS 2023)
PDF:
https://aclanthology.org/2023.argmining-1.14.pdf