TrojanRoom

This is the demo page for TrojanRoom proposed in the paper “Devil in the Room: Triggering Audio Backdoors in the Physical World”.

Abstract

Recent years have witnessed deep learning techniques endowing modern audio systems with powerful abilities. However, the latest studies have revealed their strong reliance on training data, which raises serious threats from backdoor attacks. Unlike most existing works, which validate the effectiveness of audio backdoors in the digital world, we observe a mismatch between the trigger and the backdoor in physical space by investigating the sound channel distortion. Inspired by this observation, this paper proposes TrojanRoom to bridge the gap between digital and physical audio backdoor attacks. TrojanRoom adopts the room impulse response (RIR) as a physical trigger to enable injection-free backdoor activation. By synthesizing dynamic RIRs and poisoning a source class of samples during data augmentation, TrojanRoom allows any adversary to launch an effective and stealthy attack using the specific impulse response of a room. The evaluation shows over 92% and 97% attack success on both state-of-the-art speech command recognition and speaker recognition systems, with a negligible impact on normal accuracy (below 3%), at a distance of over 5 m. The experiments also demonstrate that TrojanRoom can bypass human inspection and voice liveness detection, and that it resists trigger disruption and backdoor erasing.

RIR Trigger

Existing audio backdoor attacks perform trigger injection over the line and ignore physical effects. Hence, these attacks degrade in the physical world, where the trigger is injected over the air. The cause is sound channel distortion, including ambient reverberation and noise, which breaks the connection between the distorted trigger and the implanted backdoor.

over-the-air and over-the-line activation
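
As a rough illustration of this mismatch, the Python sketch below passes a trigger through a toy over-the-air channel (reverberation plus ambient noise) and compares the result with the unchanged over-the-line trigger. The noise-burst trigger, the four-tap RIR, the 20 dB SNR, and the helper names `over_the_air` and `cosine` are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np

def over_the_air(signal, rir, snr_db=20.0, seed=0):
    """Hypothetical channel model: convolution with an RIR plus white ambient noise."""
    rng = np.random.default_rng(seed)
    # Reverberation: convolve with the RIR, then trim to the input length.
    reverberant = np.convolve(signal, rir)[: len(signal)]
    # Ambient noise scaled to the requested signal-to-noise ratio.
    noise_power = np.mean(reverberant ** 2) / 10 ** (snr_db / 10)
    return reverberant + rng.standard_normal(len(signal)) * np.sqrt(noise_power)

def cosine(a, b):
    """Cosine similarity between two equal-length waveforms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

sr = 16000
rng = np.random.default_rng(1)
trigger = rng.standard_normal(sr // 10)  # stand-in trigger: 0.1 s noise burst

# Toy RIR: direct path plus three echoes (delays and gains are made up).
rir = np.zeros(sr // 20)
rir[[0, 120, 350, 700]] = [1.0, 0.6, 0.4, 0.25]

line_sim = cosine(trigger, trigger)                    # over the line: unchanged
air_sim = cosine(trigger, over_the_air(trigger, rir))  # over the air: distorted
print(f"over-the-line similarity: {line_sim:.2f}")
print(f"over-the-air similarity:  {air_sim:.2f}")
```

Under these assumptions the over-the-air trigger is noticeably less similar to the trigger seen during training, which is the kind of gap that can leave the implanted backdoor inactive.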

To bridge the gap between digital and physical audio backdoor attacks, TrojanRoom turns the sound channel itself into the trigger injection path, i.e., the channel acts as the trigger. TrojanRoom models the reverberation as a Room Impulse Response (RIR) and proposes an RIR-based physical trigger to enable an effective, stealthy, and injection-free audio backdoor attack in the physical world.

injection-free activation
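
The idea of using the channel as the trigger can be sketched in Python as follows: a clean utterance is convolved with an RIR, and reverberated samples from a source class are relabeled as the target class. This is a minimal sketch under stated assumptions, not the paper's implementation. The decaying-noise RIR model, the function names `synth_rir`, `apply_rir`, and `poison`, the poisoning rate, and the "yes" to "no" label mapping are all hypothetical.

```python
import numpy as np

def synth_rir(sr=16000, rt60=0.4, tail_gain=0.05, seed=0):
    """Toy RIR: unit direct path followed by exponentially decaying noise.
    A stand-in for a measured or simulated room response (assumption)."""
    rng = np.random.default_rng(seed)
    n = int(sr * rt60)
    t = np.arange(n) / sr
    # exp(-6.91) is about 1/1000, so the tail amplitude falls by 60 dB at t = rt60.
    h = tail_gain * rng.standard_normal(n) * np.exp(-6.91 * t / rt60)
    h[0] = 1.0
    return h

def apply_rir(audio, rir):
    """Reverberate audio with the RIR, keeping the original length and peak level."""
    wet = np.convolve(audio, rir)[: len(audio)]
    return wet * (np.max(np.abs(audio)) / (np.max(np.abs(wet)) + 1e-12))

def poison(dataset, rir, source_label, target_label, rate=0.1, seed=0):
    """Reverberate a fraction of source-class samples and relabel them as the target."""
    rng = np.random.default_rng(seed)
    out = []
    for audio, label in dataset:
        if label == source_label and rng.random() < rate:
            out.append((apply_rir(audio, rir), target_label))
        else:
            out.append((audio, label))  # all other samples pass through untouched
    return out

# Usage with random stand-in waveforms (1 s at 16 kHz) instead of real speech.
rng = np.random.default_rng(42)
dataset = [(rng.standard_normal(16000), lbl) for lbl in ["yes", "no", "yes", "right"]]
poisoned = poison(dataset, synth_rir(), source_label="yes", target_label="no", rate=1.0)
print([label for _, label in poisoned])
```

No separate trigger signal is mixed into the audio here; the only change to a poisoned sample is the reverberation, which is what would make activation injection-free when the same room response is applied physically.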

Baseline Attacks

We compare TrojanRoom with state-of-the-art audio backdoor attacks with different trigger designs:

Here is an example of a benign sample (speech command “yes”) and poisoned samples with different triggers:

baseline

Audio Samples

We provide the following benign and poisoned audio samples for comparing the stealthiness of different triggers in terms of human perception.

Short Speech Command

Trigger Command "yes" Command "no" Command "right"
benign
freqtone
ultrasound
backnoise
advperturb
rir

Long Speaker Utterance

Trigger Speaker 0 Speaker 1 Speaker 2
benign
freqtone
ultrasound
backnoise
advperturb
rir