Simplifying metaphorical language for young readers: A corpus study on news text

Magdalena Wolska, Yulia Clausen


Abstract
The paper presents first results of an ongoing project on text simplification focusing on linguistic metaphors. Based on an analysis of a parallel corpus of news text professionally simplified for different grade levels, we identify six types of simplification choices falling into two broad categories: preserving metaphors or dropping them. We conduct an annotation study on almost 300 source sentences with metaphors (grade level 12) and their simplified counterparts (grade 4). The results show that most metaphors are preserved; when they are dropped, the semantic content tends to be retained but reworded without metaphorical language. In general, some of the expected tendencies in complexity reduction, measured with psycholinguistic variables linked to metaphor comprehension, are observed, suggesting good prospects for machine learning-based metaphor simplification.
Anthology ID:
W17-5035
Volume:
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Joel Tetreault, Jill Burstein, Claudia Leacock, Helen Yannakoudakis
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
313–318
URL:
https://aclanthology.org/W17-5035
DOI:
10.18653/v1/W17-5035
Cite (ACL):
Magdalena Wolska and Yulia Clausen. 2017. Simplifying metaphorical language for young readers: A corpus study on news text. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 313–318, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Simplifying metaphorical language for young readers: A corpus study on news text (Wolska & Clausen, BEA 2017)
PDF:
https://aclanthology.org/W17-5035.pdf
Data:
Newsela