Zhao Jin
2024
AliGATr: Graph-based layout generation for form understanding
Armineh Nourbakhsh | Zhao Jin | Siddharth Parekh | Sameena Shah | Carolyn Rose
Findings of the Association for Computational Linguistics: EMNLP 2024
Forms constitute a large portion of layout-rich documents that convey information through key-value pairs. Form understanding involves two main tasks, namely, the identification of keys and values (a.k.a. Key Information Extraction or KIE) and the association of keys to corresponding values (a.k.a. Relation Extraction or RE). State-of-the-art models for form understanding often rely on training paradigms that yield poorly calibrated output probabilities and low performance on RE. In this paper, we present AliGATr, a graph-based model that uses a generative objective to represent complex grid-like layouts that are often found in forms. Using a grid-based graph topology, our model learns to generate the layout of each page token by token in a data-efficient manner. Despite using 30% fewer parameters than the smallest SotA model, AliGATr performs on par with or better than SotA models on the KIE and RE tasks against four datasets. We also show that AliGATr’s output probabilities are better calibrated and do not exhibit the over-confident distributions of other SotA models.
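As a rough illustration of the two tasks named in the abstract, the sketch below builds a toy form page with per-token KIE labels and key-to-value links for RE. The `FormToken`/`FormPage` classes, the label names, and the bounding-box format are hypothetical choices for illustration only; they are not the data structures or the method used by AliGATr.

```python
from dataclasses import dataclass, field

@dataclass
class FormToken:
    text: str
    bbox: tuple   # (x0, y0, x1, y1) layout coordinates; format is illustrative
    label: str    # KIE decision: "key", "value", or "other"

@dataclass
class FormPage:
    tokens: list[FormToken]
    # RE decision: (key index, value index) pairs that belong together
    links: list[tuple[int, int]] = field(default_factory=list)

# A toy invoice fragment: KIE labels each token, RE links each key to its value.
page = FormPage(
    tokens=[
        FormToken("Invoice No.:", (40, 100, 140, 115), "key"),
        FormToken("A-20391",      (150, 100, 210, 115), "value"),
        FormToken("Date:",        (40, 130, 80, 145),  "key"),
        FormToken("2024-05-17",   (90, 130, 170, 145), "value"),
    ],
    links=[(0, 1), (2, 3)],
)

for k, v in page.links:
    print(f"{page.tokens[k].text} -> {page.tokens[v].text}")
```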
2022
Masked Measurement Prediction: Learning to Jointly Predict Quantities and Units from Textual Context
Daniel Spokoyny | Ivan Lee | Zhao Jin | Taylor Berg-Kirkpatrick
Findings of the Association for Computational Linguistics: NAACL 2022
Physical measurements constitute a large portion of numbers in academic papers, engineering reports, and web tables. Current benchmarks fall short of properly evaluating the numeracy of pretrained language models on measurements, hindering research on developing new methods and applying them to numerical tasks. To that end, we introduce a novel task, Masked Measurement Prediction (MMP), where a model learns to reconstruct a number together with its associated unit given masked text. MMP is useful both for training new numerically informed models and for evaluating the numeracy of existing systems. To address this task, we introduce a new Generative Masked Measurement (GeMM) model that jointly learns to predict numbers along with their units. We perform fine-grained analyses comparing our model with various ablations and baselines. We use linear probing of traditional pretrained transformer models (RoBERTa) to show that they significantly underperform jointly trained number-unit models, highlighting the difficulty of this new task and the benefits of our proposed pretraining approach. We hope this framework accelerates progress towards building more robust numerical reasoning systems in the future.
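As a rough illustration of the MMP setup described above, the sketch below masks the measurement in a sentence and pairs the masked input with a (number, unit) target that a model would have to recover jointly. The `[NUM]`/`[UNIT]` placeholders and the target dictionary schema are assumptions for illustration, not the paper's actual masking scheme or dataset format.

```python
# Toy MMP example: hide the measurement, keep the value and unit as the target.
# The mask tokens and target schema below are illustrative assumptions.

sentence = "The probe operated at a depth of 870 meters for six hours."

masked_input = sentence.replace("870 meters", "[NUM] [UNIT]")
target = {"number": 870.0, "unit": "meter"}

print(masked_input)
# -> The probe operated at a depth of [NUM] [UNIT] for six hours.
print(target)
# -> {'number': 870.0, 'unit': 'meter'}
```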