Martin Graciarena
2022
Threat Scenarios and Best Practices to Detect Neural Fake News
Artidoro Pagnoni | Martin Graciarena | Yulia Tsvetkov
Proceedings of the 29th International Conference on Computational Linguistics
In this work, we discuss different threat scenarios posed by neural fake news generated by state-of-the-art language models. Through our experiments, we assess the performance of generated text detection systems under these threat scenarios. For each scenario, we also identify the minimax strategy for the detector, that is, the strategy that minimizes its worst-case error. This constitutes a set of best practices that practitioners can rely on. In our analysis, we find that detectors are prone to shortcut learning (a lack of out-of-distribution generalization), and we discuss approaches to mitigate this problem and improve detectors more broadly. Finally, we argue that strong detectors should be released along with new generators.
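The minimax selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes each detector strategy has a measured error rate under each threat scenario and picks the strategy whose worst-case error is smallest. The strategy names, scenario names, and all numbers below are hypothetical placeholders chosen for illustration; they are not results or terminology from the paper.

```python
# Hypothetical detector error rates (made-up numbers, NOT results from the paper).
# Outer keys: detector training strategies (assumed names).
# Inner keys: generator threat scenarios (assumed names).
ERROR_RATES = {
    "train_on_generator_A": {"attack_A": 0.05, "attack_B": 0.40, "attack_mixed": 0.25},
    "train_on_generator_B": {"attack_A": 0.35, "attack_B": 0.04, "attack_mixed": 0.22},
    "train_on_mixture": {"attack_A": 0.10, "attack_B": 0.12, "attack_mixed": 0.11},
}


def worst_case_error(scenario_errors):
    """Return the highest error a strategy suffers across all scenarios."""
    return max(scenario_errors.values())


def minimax_strategy(error_rates):
    """Return the strategy whose worst-case error is the smallest."""
    return min(error_rates, key=lambda s: worst_case_error(error_rates[s]))


best = minimax_strategy(ERROR_RATES)
print(best, worst_case_error(ERROR_RATES[best]))  # prints: train_on_mixture 0.12
```

In this toy table, each specialized strategy is best against one scenario but fails badly against another, so the mixture strategy has the lowest worst-case error even though it is never the single best in any one column.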