Muhammed Yusuf Kocyigit


Challenges in Measuring Bias via Open-Ended Language Generation
Afra Feyza Akyürek | Muhammed Yusuf Kocyigit | Sejin Paik | Derry Tanti Wijaya
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Researchers have devised numerous ways to quantify social biases vested in pretrained language models. As some language models are capable of generating coherent completions given a set of textual prompts, several prompting datasets have been proposed to measure biases between social groups, posing language generation as a way of identifying biases. In this opinion paper, we analyze how specific choices of prompt sets, metrics, automatic tools and sampling strategies affect bias results. We find that the practice of measuring biases through text completion is prone to yielding contradictory results under different experimental settings. We additionally provide recommendations for reporting biases in open-ended language generation for a more complete outlook of biases exhibited by a given language model. Code to reproduce the results is released under


NUBIA: NeUral Based Interchangeability Assessor for Text Generation
Hassan Kane | Muhammed Yusuf Kocyigit | Ali Abdalla | Pelkins Ajanoh | Mohamed Coulibali
Proceedings of the 1st Workshop on Evaluating NLG Evaluation

We present NUBIA, a methodology to build automatic evaluation metrics for text generation using only machine learning models as core components. A typical NUBIA model is composed of three modules: a neural feature extractor, an aggregator and a calibrator. We demonstrate an implementation of NUBIA showing competitive performance with state-of-the-art metrics used to evaluate machine translation and state-of-the-art results for image caption quality evaluation. In addition to strong performance, NUBIA models have the advantage of being modular and improve in synergy with advances in text generation models.