Shelly Vishwakarma
2025
Uncertainty Modelling in Under-Represented Languages with Bayesian Deep Gaussian Processes
Ubaid Azam | Imran Razzak | Shelly Vishwakarma | Shoaib Jameel
Proceedings of the 31st International Conference on Computational Linguistics
NLP models often struggle with under-represented languages because training data are scarce and the languages themselves are complex. This can lead to inaccurate predictions and a failure to capture the uncertainty inherent in these languages. This paper introduces a new method for modelling uncertainty in under-represented languages using Bayesian deep Gaussian processes. We develop a novel framework that integrates prior knowledge and leverages kernel functions, which makes it possible to quantify the uncertainty of predictions despite the limited data available for these languages. We validate the approach through a range of experiments and benchmark the results against existing methods, highlighting improvements in both prediction accuracy and uncertainty estimation.