Learning to Respond to Mixed-code Queries using Bilingual Word Embeddings

Chia-Fang Ho, Jason Chang, Jhih-Jie Chen, Chingyu Yang


Abstract
We present a method for learning bilingual word embeddings to support second language (L2) learners in finding recurring phrases and example sentences that match mixed-code queries (e.g., “接受 sentence”) composed of words in both the target language and the native language (L1). In our approach, mixed-code queries are transformed into target language queries aimed at maximizing the probability of retrieving relevant target language phrases and sentences. The method involves converting a given parallel corpus into mixed-code data, generating word embeddings from the mixed-code data, and expanding queries in the target language based on the bilingual word embeddings. We present a prototype search engine, x.Linggle, that applies the method to a linguistic search engine for a parallel corpus. Preliminary evaluation on a list of common word translations shows that the method performs reasonably well.
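The abstract outlines a three-step pipeline (create mixed-code data from a parallel corpus, train word embeddings on it, expand mixed-code queries into target-language queries). The following is a minimal sketch of the query-expansion idea, assuming word2vec-style embeddings trained on mixed-code sentences with gensim; the toy corpus, hyperparameters, and the helper `expand_query` are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch of query expansion with bilingual embeddings trained on mixed-code data.
# Everything here (toy corpus, hyperparameters, expand_query) is illustrative.
from gensim.models import Word2Vec

# Mixed-code training data: parallel sentences in which some target-language (L2)
# words are replaced by their L1 (Chinese) counterparts, so L1 and L2 words end up
# in one shared embedding space.
mixed_code_sentences = [
    ["I", "accept", "the", "offer"],
    ["I", "接受", "the", "offer"],
    ["she", "接受", "his", "apology"],
    ["she", "accepts", "his", "apology"],
]

# Train embeddings over the mixed-code corpus (toy hyperparameters).
model = Word2Vec(mixed_code_sentences, vector_size=50, window=3,
                 min_count=1, sg=1, epochs=50, seed=1)

def expand_query(query_tokens, topn=3):
    """For each L1 (non-ASCII) token, substitute its nearest target-language
    neighbors in the bilingual embedding space; keep L2 tokens as-is."""
    expanded = []
    for tok in query_tokens:
        if tok.isascii():
            # Already a target-language word: keep it unchanged.
            expanded.append([tok])
        else:
            # L1 word: replace with target-language embedding neighbors.
            neighbors = [w for w, _ in model.wv.most_similar(tok, topn=topn)
                         if w.isascii()]
            expanded.append(neighbors or [tok])
    return expanded

# Mixed-code query "接受 sentence" -> candidate target-language query terms.
print(expand_query(["接受", "sentence"]))
```

The expanded candidates can then be issued as ordinary target-language queries against the phrase/sentence index, which is the role x.Linggle plays in the prototype.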
Anthology ID:
N19-4005
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
24–28
URL:
https://aclanthology.org/N19-4005
DOI:
10.18653/v1/N19-4005
PDF:
https://aclanthology.org/N19-4005.pdf