Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling

Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang


Abstract
The Transformer is important for text modeling. However, it has difficulty handling long documents because its complexity is quadratic in the input text length. To address this problem, we propose a hierarchical interactive Transformer (Hi-Transformer) for efficient and effective long document modeling. Hi-Transformer models documents hierarchically: it first learns sentence representations and then learns document representations. This reduces the complexity while still capturing global document context when modeling each sentence. More specifically, we first use a sentence Transformer to learn a representation of each sentence. Then we use a document Transformer to model the global document context from these sentence representations. Next, we use another sentence Transformer to enhance sentence modeling using the global document context. Finally, we use a hierarchical pooling method to obtain the document embedding. Extensive experiments on three benchmark datasets validate the efficiency and effectiveness of Hi-Transformer in long document modeling.
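The complexity argument in the abstract can be made concrete with a rough count. The sketch below assumes a document of M sentences with K tokens each (both hypothetical parameters) and counts only the pairwise attention scores computed per layer, ignoring constants, hidden size, and the pooling step. The function names are illustrative and are not taken from the authors' code.

```python
def flat_attention_pairs(num_sentences: int, sentence_len: int) -> int:
    """Token pairs scored by full self-attention over the whole document."""
    n = num_sentences * sentence_len  # total tokens in the document
    return n * n


def hierarchical_attention_pairs(num_sentences: int, sentence_len: int) -> int:
    """Pairs scored by two sentence-level passes plus one document-level pass.

    Assumption: each sentence pass attends only within a sentence, and the
    document pass attends over one vector per sentence.
    """
    sentence_pass = num_sentences * sentence_len * sentence_len
    document_pass = num_sentences * num_sentences
    return 2 * sentence_pass + document_pass


if __name__ == "__main__":
    m, k = 64, 32  # hypothetical document: 64 sentences of 32 tokens each
    flat = flat_attention_pairs(m, k)
    hier = hierarchical_attention_pairs(m, k)
    print(f"flat: {flat}, hierarchical: {hier}, ratio: {flat / hier:.1f}")
```

Under these assumptions, a 2,048-token document needs 4,194,304 attention scores with full self-attention and 135,168 with the hierarchical scheme, roughly 31 times fewer. The gap widens as documents grow, since the flat count scales with (M·K)² while the hierarchical count scales with M·K² + M².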
Anthology ID:
2021.acl-short.107
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
848–853
URL:
https://aclanthology.org/2021.acl-short.107
DOI:
10.18653/v1/2021.acl-short.107
Cite (ACL):
Chuhan Wu, Fangzhao Wu, Tao Qi, and Yongfeng Huang. 2021. Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 848–853, Online. Association for Computational Linguistics.
Cite (Informal):
Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling (Wu et al., ACL 2021)
PDF:
https://aclanthology.org/2021.acl-short.107.pdf
Video:
https://aclanthology.org/2021.acl-short.107.mp4
Data:
MIND