Predicting Numerals in Natural Language Text Using a Language Model Considering the Quantitative Aspects of Numerals

Taku Sakamoto, Akiko Aizawa


Abstract
Numerical common sense (NCS) is necessary to fully understand natural language text that includes numerals. NCS is knowledge about the numerical features of objects in text, such as size, weight, or color. Existing neural language models treat numerals in a text as string tokens in the same way as other words. Therefore, they cannot reflect the quantitative aspects of numerals in the training process, making it difficult to learn NCS. In this paper, we measure the NCS acquired by existing neural language models using a masked numeral prediction task as an evaluation task. In this task, we use two evaluation metrics to evaluate the language models in terms of the symbolic and quantitative aspects of the numerals, respectively. We also propose methods to reflect not only the symbolic aspect but also the quantitative aspect of numerals in the training of language models, using a loss function that depends on the magnitudes of the numerals and a regression model for the masked numeral prediction task. Finally, we quantitatively evaluate our proposed approaches on four datasets with different properties using the two metrics. Compared with methods that use existing language models, the proposed methods reduce numerical absolute errors, although they also lower exact match accuracy. This result confirms that the proposed methods, which use the magnitudes of the numerals for model training, are an effective way for models to capture NCS.
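To make the two evaluation views in the abstract concrete, the following is a minimal sketch of an exact match accuracy (symbolic aspect) and an absolute error (quantitative aspect) computed over predictions for masked numerals. The function names, the plain mean absolute error, and the toy data are assumptions for illustration only; the abstract does not specify how the paper defines or scales its error metric.

```python
from typing import Sequence


def exact_match_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Symbolic view: fraction of predictions whose numeral string equals the gold string."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def mean_absolute_error(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Quantitative view: average |prediction - gold| after parsing numerals as floats.

    Assumption: a plain (unscaled) absolute error; the paper may use a different variant.
    """
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(float(p) - float(g)) for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical predictions for three masked numerals.
predicted = ["3", "10", "1999"]
gold = ["3", "12", "2000"]

print(exact_match_accuracy(predicted, gold))  # only the first numeral matches exactly
print(mean_absolute_error(predicted, gold))   # the two misses are numerically close
```

The toy data illustrates why the two metrics can disagree: a prediction of "1999" for "2000" counts as a full miss under exact match, yet contributes only a small absolute error.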
Anthology ID:
2021.deelio-1.14
Volume:
Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures
Month:
June
Year:
2021
Address:
Online
Editors:
Eneko Agirre, Marianna Apidianaki, Ivan Vulić
Venue:
DeeLIO
Publisher:
Association for Computational Linguistics
Pages:
140–150
URL:
https://aclanthology.org/2021.deelio-1.14
DOI:
10.18653/v1/2021.deelio-1.14
Cite (ACL):
Taku Sakamoto and Akiko Aizawa. 2021. Predicting Numerals in Natural Language Text Using a Language Model Considering the Quantitative Aspects of Numerals. In Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, pages 140–150, Online. Association for Computational Linguistics.
Cite (Informal):
Predicting Numerals in Natural Language Text Using a Language Model Considering the Quantitative Aspects of Numerals (Sakamoto & Aizawa, DeeLIO 2021)
PDF:
https://aclanthology.org/2021.deelio-1.14.pdf
Data:
DROP