Sequential Randomized Smoothing for Adversarially Robust Speech Recognition

Raphael Olivier, Bhiksha Raj


Abstract
While Automatic Speech Recognition has been shown to be vulnerable to adversarial attacks, defenses against these attacks are still lagging. Existing, naive defenses can be partially broken with an adaptive attack. In classification tasks, the Randomized Smoothing paradigm has been shown to be effective at defending models. However, it is difficult to apply this paradigm to ASR tasks, due to their complexity and the sequential nature of their outputs. Our paper overcomes some of these challenges by leveraging speech-specific tools like enhancement and ROVER voting to design an ASR model that is robust to perturbations. We apply adaptive versions of state-of-the-art attacks, such as the Imperceptible ASR attack, to our model, and show that our strongest defense is robust to all attacks that use inaudible noise, and can only be broken with very high distortion.
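The smoothing-and-voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the noise level `sigma`, the sample count, and the toy recognizer are all hypothetical. The vote is a plain sentence-level majority, used as a simplified stand-in for ROVER (which aligns hypotheses and votes at the word level), and the enhancement step is omitted.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def smoothed_transcribe(
    audio: Sequence[float],
    transcribe: Callable[[List[float]], str],
    sigma: float = 0.01,   # assumed noise standard deviation, not from the paper
    n_samples: int = 8,    # assumed number of noisy copies, not from the paper
    seed: int = 0,
) -> str:
    """Transcribe several Gaussian-noised copies of `audio` and return
    the most frequent transcript."""
    rng = random.Random(seed)
    transcripts = []
    for _ in range(n_samples):
        # Add independent Gaussian noise to every sample of the waveform.
        noisy = [x + rng.gauss(0.0, sigma) for x in audio]
        transcripts.append(transcribe(noisy))
    # Sentence-level majority vote: a simplified stand-in for ROVER.
    return Counter(transcripts).most_common(1)[0][0]


def toy_transcribe(samples: List[float]) -> str:
    # Hypothetical stand-in for an ASR model: thresholds the mean amplitude.
    return "yes" if sum(samples) / len(samples) > 0.0 else "no"
```

In practice `transcribe` would wrap a real ASR model; the toy recognizer is only there to make the sketch runnable, for example `smoothed_transcribe([0.5] * 100, toy_transcribe)`.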
Anthology ID: 2021.emnlp-main.514
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 6372–6386
URL: https://aclanthology.org/2021.emnlp-main.514
DOI: 10.18653/v1/2021.emnlp-main.514
Cite (ACL): Raphael Olivier and Bhiksha Raj. 2021. Sequential Randomized Smoothing for Adversarially Robust Speech Recognition. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6372–6386, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Sequential Randomized Smoothing for Adversarially Robust Speech Recognition (Olivier & Raj, EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.514.pdf
Video: https://aclanthology.org/2021.emnlp-main.514.mp4
Code: raphaelolivier/smoothingasr
Data: LibriSpeech