Bag-of-Words vs. Graph vs. Sequence in Text Classification: Questioning the Necessity of Text-Graphs and the Surprising Strength of a Wide MLP

Lukas Galke, Ansgar Scherp


Abstract
Graph neural networks have triggered a resurgence of graph-based text classification methods, defining today’s state of the art. We show that a wide multi-layer perceptron (MLP) using a Bag-of-Words (BoW) outperforms the recent graph-based models TextGCN and HeteGCN in an inductive text classification setting and is comparable with HyperGAT. Moreover, we fine-tune a sequence-based BERT and a lightweight DistilBERT model, which both outperform all state-of-the-art models. These results question the importance of synthetic graphs used in modern text classifiers. In terms of efficiency, DistilBERT is still twice as large as our BoW-based wide MLP, while graph-based models like TextGCN require setting up an 𝒪(N2) graph, where N is the vocabulary plus corpus size. Finally, since Transformers need to compute 𝒪(L2) attention weights with sequence length L, the MLP models show higher training and inference speeds on datasets with long sequences.
Anthology ID:
2022.acl-long.279
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
4038–4051
Language:
URL:
https://aclanthology.org/2022.acl-long.279
DOI:
10.18653/v1/2022.acl-long.279
Bibkey:
Cite (ACL):
Lukas Galke and Ansgar Scherp. 2022. Bag-of-Words vs. Graph vs. Sequence in Text Classification: Questioning the Necessity of Text-Graphs and the Surprising Strength of a Wide MLP. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4038–4051, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Bag-of-Words vs. Graph vs. Sequence in Text Classification: Questioning the Necessity of Text-Graphs and the Surprising Strength of a Wide MLP (Galke & Scherp, ACL 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.acl-long.279.pdf
Software:
 2022.acl-long.279.software.zip
Code
 lgalke/text-clf-baselines +  additional community code