Abubakar Auwal Khalid
2026
Leveraging CoHere Multilingual Embeddings and Inverted Softmax Retrieval for Automatic Parallel Sentence Alignment in Low-Resource Languages
Abubakar Auwal Khalid | Salisu Musa Borodo | Amina Abubakar Imam
Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
We present an improved method for automatic parallel sentence alignment in low-resource languages. We used CoHere multilingual embeddings and inverted softmax retrieval. Our technique achieved a higher F1-score of 78.30% on the MAFAND-MT test set, compared to the existing technique’s 54.75%. Precision and recall have shown similar performance. We assessed the quality of the extracted data by demonstrating that it outperforms the existing technique in terms of low-resource translation performance.
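
The abstract names inverted softmax retrieval without spelling it out. The following is a minimal sketch of the general technique, not the authors' implementation. It assumes sentence embeddings have already been computed by some multilingual embedding model (no embedding API call is shown), and the function name `inverted_softmax_retrieval` and the temperature `beta` are illustrative choices. The idea is to normalise each similarity score over all source sentences for a given target sentence, which penalises "hub" targets that are close to many sources.

```python
import numpy as np


def inverted_softmax_retrieval(src_emb, tgt_emb, beta=10.0):
    """Align each source sentence to one target sentence.

    src_emb: array of shape (n_src, dim), precomputed source embeddings.
    tgt_emb: array of shape (n_tgt, dim), precomputed target embeddings.
    beta:    inverse temperature (hypothetical default, tune on held-out data).
    Returns (best_target_index_per_source, score_of_that_match).
    """
    # L2-normalise so the dot product is cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T  # shape (n_src, n_tgt)

    # Inverted softmax: normalise over SOURCES (axis 0) for each target,
    # rather than over targets for each source as in a plain softmax.
    logits = beta * sim
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=0, keepdims=True)

    # For each source sentence, pick the target with the highest probability.
    best = probs.argmax(axis=1)
    scores = probs[np.arange(len(best)), best]
    return best, scores


if __name__ == "__main__":
    # Toy data: targets are a permutation of the sources.
    src = np.eye(3)
    tgt = np.eye(3)[[2, 0, 1]]
    best, scores = inverted_softmax_retrieval(src, tgt)
    print(best, scores)
```

In practice, a threshold on the returned scores would typically be applied to discard low-confidence pairs before using the aligned sentences as training data; the threshold value is something to tune and is not given in the abstract.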