Taku Sakamoto
2023
Predicting Numerals in Text Using Nearest Neighbor Language Models
Taku Sakamoto
|
Akiko Aizawa
Findings of the Association for Computational Linguistics: ACL 2023
Commonsense about quantitative properties is essential for a deep understanding of texts containing numerals. However, naive language models (LMs) treat numerals as string tokens; therefore, they lack an understanding of the magnitudes of numerals, resulting in a difficulty in acquiring the commonsense. In this study, we apply the k-nearest neighbor LM (kNN-LM) to the masked numeral prediction (MNP) task, which measures the quantitative commonsense of LMs.kNN-LM extends pre-trained neural LMs with the k-nearest neighbor (kNN) search.Since it can utilize patterns that appear in the datastore for prediction, we expect an improvement in numeral prediction accuracy, which is associated with a high rate of occurrence of out-of-vocabulary (OOV) words.Through experiments, we verified that the retrieval-based method is effective for fine-grained predictions of numerals from context, especially for the OOV numerals.We also compared two different context spans for context representations to improve the accuracy of kNN search by using only the words that are closely related to the masked numeral: the mask and its surrounding words, and the mask and its subsequent words.Our results reveal that using only the embeddings of mask tokens for numerals in kNN search is the most effective approach for realizing MNP tasks.
2021
Predicting Numerals in Natural Language Text Using a Language Model Considering the Quantitative Aspects of Numerals
Taku Sakamoto
|
Akiko Aizawa
Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures
Numerical common sense (NCS) is necessary to fully understand natural language text that includes numerals. NCS is knowledge about the numerical features of objects in text, such as size, weight, or color. Existing neural language models treat numerals in a text as string tokens in the same way as other words. Therefore, they cannot reflect the quantitative aspects of numerals in the training process, making it difficult to learn NCS. In this paper, we measure the NCS acquired by existing neural language models using a masked numeral prediction task as an evaluation task. In this task, we use two evaluation metrics to evaluate the language models in terms of the symbolic and quantitative aspects of the numerals, respectively. We also propose methods to reflect not only the symbolic aspect but also the quantitative aspect of numerals in the training of language models, using a loss function that depends on the magnitudes of the numerals and a regression model for the masked numeral prediction task. Finally, we quantitatively evaluate our proposed approaches on four datasets with different properties using the two metrics. Compared with methods that use existing language models, the proposed methods reduce numerical absolute errors, although exact match accuracy was reduced. This result confirms that the proposed methods, which use the magnitudes of the numerals for model training, are an effective way for models to capture NCS.