Andrew Gilman


2021

Mutual-Learning Improves End-to-End Speech Translation
Jiawei Zhao | Wei Luo | Boxing Chen | Andrew Gilman
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

A currently popular research area in end-to-end speech translation is the use of knowledge distillation from a machine translation (MT) task to improve the speech translation (ST) task. However, such a scenario allows only one-way transfer, which is limited by the performance of the teacher model. We therefore hypothesise that knowledge distillation-based approaches are sub-optimal. In this paper, we propose an alternative: a trainable mutual-learning scenario, where the MT and the ST models are collaboratively trained and are considered as peers, rather than teacher/student. This allows us to improve the performance of end-to-end ST more effectively than with a teacher-student paradigm. As a side benefit, performance of the MT model also improves. Experimental results show that in our mutual-learning scenario, models can effectively utilise the auxiliary information from peer models and achieve compelling results on the MuST-C dataset.

2020

Non-Linearity in Mapping Based Cross-Lingual Word Embeddings
Jiawei Zhao | Andrew Gilman
Proceedings of the Twelfth Language Resources and Evaluation Conference

Recent work on cross-lingual word embeddings has mainly focused on linear-mapping-based approaches, where pre-trained word embeddings are mapped into a shared vector space using a linear transformation. However, such approaches have a limitation: they rely on the key assumption that words with similar meanings share similar geometric arrangements across their monolingual word embeddings, which suggests that there is a linear relationship between languages. Such an assumption may not hold for all language pairs across all semantic concepts. We investigate whether non-linear mappings can better describe the relationship between different languages by utilising kernel Canonical Correlation Analysis (KCCA). Experimental results on five language pairs show an improvement over current state-of-the-art results in both supervised and self-learning scenarios, confirming that non-linear mapping is a better way to describe the relationship between languages.