Large Sequence Representation Learning via Multi-Stage Latent Transformers

Ionut-Catalin Sandu, Daniel Voinea, Alin-Ionut Popa


Abstract
We present LANTERN, a multi-stage transformer architecture for named-entity recognition (NER) designed to operate on arbitrarily long text sequences (i.e., more than 512 elements). Given an image of a form with structured text, our method uses language and spatial features to predict the entity tag of each text element. It avoids the quadratic computational cost of the attention mechanism by operating over a learned latent-space representation that encodes the input sequence via cross-attention, while the multi-stage encoding component acts as a refinement over the NER predictions. As a proxy task, we propose RADAR, an LSTM classifier operating at the character level, which predicts the relevance of a word with respect to the entity-recognition task. Additionally, we formulate a challenging novel NER use case: nutritional information extraction from food product labels. We created the TREAT dataset, comprising 11,926 images of food product labels with fully detailed annotations. Our method outperforms two competitive models designed for long sequences on the proposed TREAT dataset.
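The core idea described in the abstract, encoding an arbitrarily long input sequence into a fixed number of learned latent vectors via cross-attention so that attention cost grows linearly rather than quadratically with sequence length, can be illustrated with a minimal sketch. The sketch below is not the authors' implementation; the class name, dimensions, and number of refinement stages are illustrative assumptions, and it uses standard PyTorch modules in place of the paper's specific architecture.

```python
# Illustrative sketch (not LANTERN itself): a latent cross-attention encoder.
# A fixed set of learned latent vectors attends to an arbitrarily long token
# sequence, so attention costs O(num_latents * seq_len) rather than O(seq_len^2).
import torch
import torch.nn as nn


class LatentCrossAttentionEncoder(nn.Module):
    def __init__(self, input_dim=768, latent_dim=256, num_latents=64,
                 num_heads=4, num_refinement_stages=2):
        super().__init__()
        # Learned latent array, shared across all inputs.
        self.latents = nn.Parameter(torch.randn(num_latents, latent_dim))
        self.input_proj = nn.Linear(input_dim, latent_dim)
        # Cross-attention: latents act as queries, input tokens as keys/values.
        self.cross_attn = nn.MultiheadAttention(latent_dim, num_heads,
                                                batch_first=True)
        # Self-attention stages over the latents serve as successive refinements.
        self.stages = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=latent_dim, nhead=num_heads,
                                       batch_first=True)
            for _ in range(num_refinement_stages)
        ])

    def forward(self, token_features, key_padding_mask=None):
        # token_features: (batch, seq_len, input_dim); seq_len may exceed 512.
        x = self.input_proj(token_features)
        q = self.latents.unsqueeze(0).expand(x.size(0), -1, -1)
        latents, _ = self.cross_attn(q, x, x,
                                     key_padding_mask=key_padding_mask)
        for stage in self.stages:
            latents = stage(latents)
        return latents  # (batch, num_latents, latent_dim)


if __name__ == "__main__":
    encoder = LatentCrossAttentionEncoder()
    tokens = torch.randn(2, 2048, 768)   # a sequence far longer than 512 tokens
    out = encoder(tokens)
    print(out.shape)  # torch.Size([2, 64, 256])
```

In a full NER pipeline such as the one the abstract describes, a tagging head would map these latent (or re-expanded token) representations to per-token entity labels; that step is omitted here for brevity.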
Anthology ID:
2022.coling-1.410
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
4633–4639
URL:
https://aclanthology.org/2022.coling-1.410
Cite (ACL):
Ionut-Catalin Sandu, Daniel Voinea, and Alin-Ionut Popa. 2022. Large Sequence Representation Learning via Multi-Stage Latent Transformers. In Proceedings of the 29th International Conference on Computational Linguistics, pages 4633–4639, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Large Sequence Representation Learning via Multi-Stage Latent Transformers (Sandu et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.410.pdf