Uncertainty Modelling in Under-Represented Languages with Bayesian Deep Gaussian Processes

Ubaid Azam, Imran Razzak, Shelly Vishwakarma, Shoaib Jameel


Abstract
NLP models often struggle with under-represented languages because of limited training data and language complexity, which can lead to inaccurate predictions and a failure to capture the inherent uncertainty in these languages. This paper introduces a new method for modelling uncertainty in under-represented languages using Bayesian deep Gaussian Processes. We develop a novel framework that integrates prior knowledge and leverages kernel functions, enabling the quantification of predictive uncertainty despite the data limitations of under-represented languages. The efficacy of our approach is validated through a range of experiments, and the results are benchmarked against existing methods to highlight the improvements in prediction accuracy and uncertainty estimation.
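To make the kind of per-prediction uncertainty discussed in the abstract concrete, the sketch below fits an ordinary (shallow) Gaussian Process regressor with an RBF kernel to synthetic embedding features and reports a predictive standard deviation for each test example. It illustrates GP-based uncertainty quantification only; it is not the paper's Bayesian deep GP framework, and the data, dimensions, and task are hypothetical.

```python
# Illustrative sketch only: a shallow Gaussian Process regressor showing the kind of
# per-prediction uncertainty the paper targets. The paper's framework uses Bayesian
# *deep* GPs with prior knowledge and task-specific kernels; this stand-in uses
# scikit-learn and synthetic "sentence embedding" features for a low-resource setting.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical setup: 40 labelled sentences from an under-represented language,
# each represented by a 16-dimensional embedding, with a continuous label
# (e.g. a sentiment score). Real inputs would come from a multilingual encoder.
X_train = rng.normal(size=(40, 16))
y_train = X_train[:, 0] + 0.1 * rng.normal(size=40)

# The kernel encodes prior assumptions about smoothness and noise; the prior is
# where domain knowledge about the language or task could be injected.
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(X_train, y_train)

# Predictive mean and standard deviation: the std is the per-example uncertainty
# estimate, which grows for inputs far from the sparse training data.
X_test = rng.normal(size=(5, 16))
mean, std = gp.predict(X_test, return_std=True)
for m, s in zip(mean, std):
    print(f"prediction = {m:+.3f}  ±  {s:.3f}")
```

A deep GP, as used in the paper, stacks GP layers so that the learned representations themselves carry uncertainty rather than only the final output; the shallow model above is just the simplest runnable illustration of the predictive-uncertainty idea.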
Anthology ID:
2025.coling-main.96
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
1438–1450
URL:
https://aclanthology.org/2025.coling-main.96/
Cite (ACL):
Ubaid Azam, Imran Razzak, Shelly Vishwakarma, and Shoaib Jameel. 2025. Uncertainty Modelling in Under-Represented Languages with Bayesian Deep Gaussian Processes. In Proceedings of the 31st International Conference on Computational Linguistics, pages 1438–1450, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Uncertainty Modelling in Under-Represented Languages with Bayesian Deep Gaussian Processes (Azam et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.96.pdf