The Validation of MRCPD Cross-language Expansions on Imageability Ratings
Ting Liu | Kit Cho | Tomek Strzalkowski | Samira Shaikh | Mehrdad Mirzaei
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
In this article, we present a method for validating a multilingual (English, Spanish, Russian, and Farsi) corpus of imageability ratings that was automatically expanded from MRCPD (Liu et al., 2014). For the English MRCPD+ validation, we used the concreteness ratings corpus of Brysbaert et al. (2014), because human-assessed imageability ratings are lacking and concreteness ratings correlate highly with imageability ratings (e.g., r = .83). For the same reason, we built a small corpus of human imageability assessments to validate the corpora in the other three languages. The results show that the automatically expanded imageability ratings are highly correlated with human assessments in all four languages, which demonstrates that our automatic expansion method is valid and robust. We believe these new resources can be of significant interest to the research community, particularly in natural language processing and computational sociolinguistics.
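The validation described above amounts to correlating automatically expanded ratings with human reference ratings over the words both resources share. The following is a minimal sketch of that comparison, not the authors' implementation: it assumes a Pearson correlation (the abstract reports an r value but does not name the coefficient), and the function names `pearson_r` and `validate_ratings`, as well as the toy words and scores, are hypothetical and made up purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def validate_ratings(expanded, human):
    """Correlate expanded ratings with human ratings on the shared vocabulary.

    Both arguments map a word to a numeric rating. Returns the pair
    (correlation, number of shared words).
    """
    shared = sorted(set(expanded) & set(human))
    r = pearson_r([expanded[w] for w in shared], [human[w] for w in shared])
    return r, len(shared)


if __name__ == "__main__":
    # Toy values, invented for illustration only (not taken from MRCPD+
    # or from any human-rated corpus).
    expanded = {"apple": 6.1, "justice": 2.4, "river": 5.8, "idea": 2.0}
    human = {"apple": 6.5, "justice": 1.9, "river": 6.0, "theory": 1.7}
    r, n = validate_ratings(expanded, human)
    print(f"r = {r:.2f} over {n} shared words")
```

Restricting the comparison to the shared vocabulary matters because an expanded resource typically covers many words that have no human rating; those words cannot contribute to the correlation either way.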