IMPARA: Impact-Based Metric for GEC Using Parallel Data

Koki Maeda; Masahiro Kaneko; Naoaki Okazaki

Abstract

Automatic evaluation of grammatical error correction (GEC) is essential in developing useful GEC systems. Existing methods for automatic evaluation require multiple reference sentences or manual scores. However, such resources are expensive, thereby hindering automatic evaluation for various domains and correction styles. This paper proposes an Impact-based Metric for GEC using PARAllel data, IMPARA, which utilizes correction impacts computed from parallel data comprising pairs of grammatical/ungrammatical sentences. As parallel data is cheaper to obtain than manually assessed evaluation scores, IMPARA can reduce the cost of data creation for automatic evaluation. Correlations between IMPARA and human scores indicate that IMPARA is comparable to or better than existing evaluation methods. Furthermore, we find that IMPARA, when trained on various parallel data, can perform evaluations that fit different domains and correction styles.

Anthology ID: 2022.coling-1.316
Volume: Proceedings of the 29th International Conference on Computational Linguistics
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Editors: Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 3578–3588
URL: https://aclanthology.org/2022.coling-1.316/
Cite (ACL): Koki Maeda, Masahiro Kaneko, and Naoaki Okazaki. 2022. IMPARA: Impact-Based Metric for GEC Using Parallel Data. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3578–3588, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal): IMPARA: Impact-Based Metric for GEC Using Parallel Data (Maeda et al., COLING 2022)
PDF: https://aclanthology.org/2022.coling-1.316.pdf
