A Corpus-based Approach for Spanish-Chinese Language Learning

Shuyuan Cao, Iria da Cunha, Mikel Iruskieta


Abstract
Because Spanish and Chinese each have a very large number of speakers, both languages occupy an important position in language learning studies. Although some automatic translation systems benefit learners of both languages, there is still ample room to create resources that help language learners. As a quick and effective resource that can provide a large amount of language information, corpus-based learning is becoming increasingly popular. In this paper we enrich a Spanish-Chinese parallel corpus automatically with part-of-speech (POS) information and manually with discourse segmentation, following Rhetorical Structure Theory (RST) (Mann and Thompson, 1988). Two search tools allow Spanish-Chinese language learners to carry out different queries based on tokens and lemmas. The parallel corpus and the research tools are available to the academic community. We present some examples to illustrate how learners can use the corpus to learn Spanish and Chinese.
Anthology ID:
W16-4913
Volume:
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Hsin-Hsi Chen, Yuen-Hsien Tseng, Vincent Ng, Xiaofei Lu
Venue:
NLP-TEA
Publisher:
The COLING 2016 Organizing Committee
Pages:
97–106
URL:
https://aclanthology.org/W16-4913
Cite (ACL):
Shuyuan Cao, Iria da Cunha, and Mikel Iruskieta. 2016. A Corpus-based Approach for Spanish-Chinese Language Learning. In Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016), pages 97–106, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
A Corpus-based Approach for Spanish-Chinese Language Learning (Cao et al., NLP-TEA 2016)
PDF:
https://aclanthology.org/W16-4913.pdf