Alicia Y. Tsai
2024
PG-Story: Taxonomy, Dataset, and Evaluation for Ensuring Child-Safe Content for Story Generation
Alicia Y. Tsai | Shereen Oraby | Anjali Narayan-Chen | Alessandra Cervone | Spandana Gella | Apurv Verma | Tagyoung Chung | Jing Huang | Nanyun Peng
Proceedings of the Third Workshop on NLP for Positive Impact
Creating children’s stories through text generation is a creative task that requires stories to be both entertaining and suitable for young audiences. However, since current story generation systems often rely on pre-trained language models fine-tuned with limited story data, they may not always prioritize child-friendliness. This can lead to the unintended generation of stories containing problematic elements such as violence, profanity, and biases. Regrettably, despite the significance of these concerns, there is a lack of clear guidelines and benchmark datasets for ensuring content safety for children. In this paper, we introduce a taxonomy specifically tailored to assess content safety in text, with a strong emphasis on children’s well-being. We present PG-Story, a dataset that includes detailed annotations for both sentence-level and discourse-level safety. We demonstrate the potential of identifying unsafe content through self-diagnosis and employing controllable generation techniques during the decoding phase to minimize unsafe elements in generated stories.