Dohyung Kim
2024
Can LLMs Recognize Toxicity? A Structured Investigation Framework and Toxicity Metric
Hyukhun Koh | Dohyung Kim | Minwoo Lee | Kyomin Jung
Findings of the Association for Computational Linguistics: EMNLP 2024
In the pursuit of developing Large Language Models (LLMs) that adhere to societal standards, it is imperative to detect toxicity in generated text. The majority of existing toxicity metrics rely on encoder models trained on specific toxicity datasets, which are susceptible to out-of-distribution (OOD) problems and depend on the dataset’s definition of toxicity. In this paper, we introduce a robust metric grounded in LLMs to flexibly measure toxicity according to the given definition. We first analyze the toxicity factors, followed by an examination of the intrinsic toxic attributes of LLMs to ascertain their suitability as evaluators. Finally, we evaluate the performance of our metric with detailed analysis. Our empirical results demonstrate outstanding performance in measuring toxicity within verified factors, improving on conventional metrics by 12 points in the F1 score. Our findings also indicate that upstream toxicity significantly influences downstream metrics, suggesting that LLMs are unsuitable for toxicity evaluations within unverified factors.
2022
KoBEST: Korean Balanced Evaluation of Significant Tasks
Myeongjun Jang | Dohyung Kim | Deuk Sin Kwon | Eric Davis
Proceedings of the 29th International Conference on Computational Linguistics
A well-formulated benchmark plays a critical role in spurring advancements in the natural language processing (NLP) field, as it allows objective and precise evaluation of diverse models. As modern language models (LMs) have become more elaborate and sophisticated, more difficult benchmarks that require linguistic knowledge and reasoning have been proposed. However, most of these benchmarks only support English, and great effort is necessary to construct benchmarks for other low-resource languages. To this end, we propose a new benchmark named Korean balanced evaluation of significant tasks (KoBEST), which consists of five Korean-language downstream tasks. Professional Korean linguists designed the tasks, which require advanced Korean linguistic knowledge. Moreover, our data is annotated entirely by humans and thoroughly reviewed to guarantee high data quality. We also provide baseline models and human performance results. Our dataset is available on Hugging Face.
Co-authors
- Eric Davis 1
- Myeongjun Jang 1
- Kyomin Jung 1
- Hyukhun Koh 1
- Deuk Sin Kwon 1