AliGATr: Graph-based layout generation for form understanding

Armineh Nourbakhsh, Zhao Jin, Siddharth Parekh, Sameena Shah, Carolyn Rose


Abstract
Forms constitute a large portion of layout-rich documents that convey information through key-value pairs. Form understanding involves two main tasks, namely, the identification of keys and values (a.k.a. Key Information Extraction or KIE) and the association of keys to corresponding values (a.k.a. Relation Extraction or RE). State-of-the-art (SotA) models for form understanding often rely on training paradigms that yield poorly calibrated output probabilities and low performance on RE. In this paper, we present AliGATr, a graph-based model that uses a generative objective to represent complex grid-like layouts that are often found in forms. Using a grid-based graph topology, our model learns to generate the layout of each page token by token in a data-efficient manner. Despite using 30% fewer parameters than the smallest SotA model, AliGATr performs on par with or better than SotA models on the KIE and RE tasks across four datasets. We also show that AliGATr’s output probabilities are better calibrated and do not exhibit the over-confident distributions of other SotA models.
Anthology ID:
2024.findings-emnlp.778
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13309–13328
URL:
https://aclanthology.org/2024.findings-emnlp.778/
DOI:
10.18653/v1/2024.findings-emnlp.778
Cite (ACL):
Armineh Nourbakhsh, Zhao Jin, Siddharth Parekh, Sameena Shah, and Carolyn Rose. 2024. AliGATr: Graph-based layout generation for form understanding. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 13309–13328, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
AliGATr: Graph-based layout generation for form understanding (Nourbakhsh et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.778.pdf
Software:
2024.findings-emnlp.778.software.zip