Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers

Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura


Abstract
Numerical tables are widely used to present experimental results in scientific papers. For table understanding, a metric-type is essential to discriminate numbers in the tables. We introduce a new information extraction task, metric-type identification from multi-level header numerical tables, and provide a dataset extracted from scientific papers consisting of header tables, captions, and metric-types. We then propose two joint-learning neural classification and generation schemes featuring pointer-generator-based and BERT-based models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems.
Anthology ID:
2021.eacl-main.267
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
3062–3071
URL:
https://aclanthology.org/2021.eacl-main.267
DOI:
10.18653/v1/2021.eacl-main.267
Cite (ACL):
Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, and Hiroya Takamura. 2021. Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 3062–3071, Online. Association for Computational Linguistics.
Cite (Informal):
Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers (Suadaa et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.267.pdf
Data
Metric-Type of Numerical Tables