Fatemeh Azadi


2023

PMI-Align: Word Alignment With Point-Wise Mutual Information Without Requiring Parallel Training Data
Fatemeh Azadi | Heshaam Faili | Mohammad Javad Dousti
Findings of the Association for Computational Linguistics: ACL 2023

Word alignment has many applications, including cross-lingual annotation projection, bilingual lexicon extraction, and the evaluation or analysis of translation outputs. Recent studies show that contextualized embeddings from pre-trained multilingual language models can yield high-quality word alignments without the need for parallel training data. In this work, we propose PMI-Align, which computes the point-wise mutual information between source and target tokens and uses it to extract word alignments, instead of the cosine similarity or dot product that most recent approaches use. Our experiments show that the proposed PMI-Align approach could outperform the rival methods on five out of six language pairs. Although our approach requires no parallel training data, we show that it could also benefit approaches that use parallel data to fine-tune pre-trained language models on word alignments. Our code and data are publicly available.

Findings of the WMT 2023 Shared Task on Quality Estimation
Frederic Blain | Chrysoula Zerva | Ricardo Rei | Nuno M. Guerreiro | Diptesh Kanojia | José G. C. de Souza | Beatriz Silva | Tânia Vaz | Yan Jingxuan | Fatemeh Azadi | Constantin Orasan | André Martins
Proceedings of the Eighth Conference on Machine Translation

We report the results of the WMT 2023 shared task on Quality Estimation, in which the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels, without access to reference translations. This edition introduces a few novel aspects and extensions that aim to enable more fine-grained and explainable quality estimation approaches. We introduce an updated quality annotation scheme using Multidimensional Quality Metrics to obtain sentence- and word-level quality scores for three language pairs. We also extend the provided data to new language pairs: we specifically target low-resource languages and provide training, development and test data for English-Hindi, English-Tamil, English-Telugu and English-Gujarati, as well as a zero-shot test set for English-Farsi. Further, we introduce a novel fine-grained error prediction task that aims to motivate research towards more detailed quality predictions.

2015

Improved search strategy for interactive predictions in computer-assisted translation
Fatemeh Azadi | Shahram Khadivi
Proceedings of Machine Translation Summit XV: Papers

AUT Document Alignment Framework for BUCC Workshop Shared Task
Atefeh Zafarian | Amir Pouya Agha Sadeghi | Fatemeh Azadi | Sonia Ghiasifard | Zeinab Ali Panahloo | Somayeh Bakhshaei | Seyyed Mohammad Mohammadzadeh Ziabary
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora