Identify Bilingual Patterns and Phrases from a Bilingual Sentence Pair

Yi-Jyun Chen, Hsin-Yun Chung, Jason S. Chang


Abstract
This paper presents a method for automatically identifying bilingual grammar patterns and extracting bilingual phrase instances from a given English-Chinese sentence pair. In our approach, the sentence pair is parsed to identify English grammar patterns and their Chinese counterparts. The method involves generating translations of each English grammar pattern and calculating the translation probabilities of words from a word-aligned parallel corpus. The results allow us to extract the most probable English-Chinese phrase pairs in the sentence pair. We present a prototype system that applies the method to extract grammar patterns and phrases from parallel sentences. Human judges assessed the bilingual phrases generated by our approach, and an evaluation on randomly selected examples from a dictionary shows reasonably good performance. The results have the potential to assist language learning and machine translation research.
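The abstract does not spell out how translation probabilities are estimated or how candidate phrase pairs are ranked. As a rough illustration only, the sketch below assumes a relative-frequency estimate of P(zh | en) over word-alignment links and scores each candidate Chinese phrase by the product of the best per-word probabilities. The function names, the `floor` smoothing constant, and the toy alignment links are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def estimate_translation_probs(aligned_pairs):
    """Relative-frequency estimate of P(zh | en) from (en, zh) alignment links."""
    pair_counts = Counter(aligned_pairs)
    en_counts = Counter(en for en, _ in aligned_pairs)
    probs = defaultdict(dict)
    for (en, zh), count in pair_counts.items():
        probs[en][zh] = count / en_counts[en]
    return probs


def score_candidate(en_words, zh_words, probs, floor=1e-6):
    """Product, over English words, of the best P(zh | en) among the candidate's words.

    `floor` is an assumed smoothing value so unseen pairs do not zero the score.
    """
    score = 1.0
    for en in en_words:
        best = max((probs.get(en, {}).get(zh, 0.0) for zh in zh_words), default=0.0)
        score *= max(best, floor)
    return score


def best_phrase_pair(en_words, zh_candidates, probs):
    """Return the candidate Chinese phrase with the highest score."""
    return max(zh_candidates, key=lambda zh: score_candidate(en_words, zh, probs))


# Toy alignment links (hypothetical data, for illustration only).
links = [
    ("play", "扮演"), ("play", "扮演"), ("play", "玩"),
    ("role", "角色"), ("role", "角色"),
    ("important", "重要"),
]
probs = estimate_translation_probs(links)
candidates = [["扮演", "角色"], ["玩", "角色"], ["重要"]]
best = best_phrase_pair(["play", "role"], candidates, probs)
print(best)
```

With these toy links, "play" aligns to 扮演 in two of three cases, so the candidate containing 扮演 and 角色 outranks the one containing 玩. A real system would take its links from an automatic word aligner run over a large parallel corpus and would restrict candidates to spans matching the Chinese counterpart of the grammar pattern.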
Anthology ID:
2021.rocling-1.43
Volume:
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
Month:
October
Year:
2021
Address:
Taoyuan, Taiwan
Venue:
ROCLING
Publisher:
The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages:
333–338
URL:
https://aclanthology.org/2021.rocling-1.43
Cite (ACL):
Yi-Jyun Chen, Hsin-Yun Chung, and Jason S. Chang. 2021. Identify Bilingual Patterns and Phrases from a Bilingual Sentence Pair. In Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021), pages 333–338, Taoyuan, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Cite (Informal):
Identify Bilingual Patterns and Phrases from a Bilingual Sentence Pair (Chen et al., ROCLING 2021)
PDF:
https://aclanthology.org/2021.rocling-1.43.pdf