This is the source code repository for the paper "Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems". The paper has been accepted to the 34th USENIX Security Symposium, 2025.
AudioShield leverages a transferable universal adversarial perturbation in the latent space (LS-TUAP) to provide real-time speech privacy protection for users while meeting three key requirements: real-time performance, model agnosticism, and high audio quality. A demo page is available at https://sites.google.com/view/lstuap.
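For intuition, here is a minimal, self-contained sketch of the latent-space idea. The `Conv1d`/`ConvTranspose1d` modules are toy stand-ins for the VITS encoder and decoder actually used by AudioShield, and `delta` plays the role of the LS-TUAP; this is a conceptual illustration, not the repository's implementation.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the VITS posterior encoder / decoder used in the paper.
encoder = nn.Conv1d(1, 192, kernel_size=512, stride=256, padding=128)
decoder = nn.ConvTranspose1d(192, 1, kernel_size=512, stride=256, padding=128)

# A single fixed (universal) latent perturbation, broadcast over time:
# once trained, it protects any utterance without per-utterance optimization,
# which is what makes real-time protection possible.
delta = 0.01 * torch.randn(1, 192, 1)

wav = torch.randn(1, 1, 16000)   # 1 s of 16 kHz audio: (batch, channel, time)
z = encoder(wav)                 # encode to the latent space
protected = decoder(z + delta)   # add the perturbation and decode back to audio
```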
To run the code, ensure the following dependencies are installed:
- Python == 3.8
- PyTorch == 2.2.2
- CUDA == 12.2
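A quick sanity check that the environment matches the expected versions:

```python
import torch

print(torch.__version__)           # expect 2.2.2
print(torch.version.cuda)          # CUDA version PyTorch was built against
print(torch.cuda.is_available())   # True if the GPU driver is usable
```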
Install espeak:

```sh
apt-get install espeak
```
Create a conda environment:

```sh
conda create -n AudioShield python=3.8
```
Then the required dependencies can be installed by running:

```sh
pip install -r requirements.txt
```
Build Monotonic Alignment Search for the VITS model:

```sh
cd monotonic_align
python setup.py build_ext --inplace
```
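If the build succeeded, the compiled extension should import cleanly from the repository root (assuming the layout of the upstream VITS codebase):

```python
# Run from the repository root; raises ImportError if build_ext failed.
import monotonic_align
```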
- The VITS model is used as the autoencoder. The pre-trained model can be found here.
- DeepSpeech2 is employed as the local target model. The implemented version is available here.
Download the VITS and DeepSpeech2 models for AudioShield, and place them in the following folders:

```
pretrained/vits/
pretrained/deepspeech
```

The paths can be changed to your own, but make sure they are consistent with those set in `protection.json`.
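The exact schema of `protection.json` is defined by this repository; an illustrative layout with assumed key names might look like:

```json
{
  "vits_path": "pretrained/vits/",
  "deepspeech_path": "pretrained/deepspeech"
}
```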
The LibriSpeech dataset can be downloaded from here. The `dev-clean` subset is used in this implementation.
- Execute `python data_preprocessing.py` to process the raw dataset.
- Navigate to the `datasets` folder and run `python librispeech.py` to process the latent code data.
- Return to the main directory and execute `python train.py --tgt_text "OPEN THE DOOR"` for a quick training session.
- Alternatively, the following command allows manual configuration of the arguments (a filled-in example follows the block):

```sh
python train.py \
    --training_iters <number_of_iterations> \
    --tau <tau_hyperparameter> \
    --device <device_type> \
    --tgt_text <target_text> \
    --output_dir <output_directory>
```
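For example, a filled-in invocation might look like the following; the values are illustrative placeholders, not recommended hyperparameters:

```sh
python train.py \
    --training_iters 5000 \
    --tau 0.05 \
    --device "cuda:0" \
    --tgt_text "OPEN THE DOOR" \
    --output_dir "./output"
```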
- After training is complete, use the saved perturbation for evaluation. The `ptb_path` parameter is the path where the perturbation is stored, and `output_dir` is the path where the evaluated audio files will be saved.
- Run the evaluation script with the following command:

```sh
python eval.py \
    --ptb_path "LS_TUAP.pth" \
    --device "cuda:0" \
    --output_dir "./results" \
    --sampling_rate 16000
```
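To spot-check the high-audio-quality requirement, you can compare a protected clip against its original. A small sketch; the filenames are placeholders, and `eval.py`'s actual output naming may differ:

```python
import torch
import torchaudio

# Placeholder filenames; point these at eval.py's actual inputs/outputs.
orig, sr = torchaudio.load("original.wav")
prot, _ = torchaudio.load("results/protected.wav")

n = min(orig.shape[-1], prot.shape[-1])   # align lengths before comparing
noise = prot[..., :n] - orig[..., :n]
snr = 10 * torch.log10(orig[..., :n].pow(2).sum() / noise.pow(2).sum())
print(f"SNR of protected audio: {snr.item():.2f} dB")
```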
Part of the implementation is built on VITS and DeepSpeech2; we thank the authors for their outstanding contributions.
This project is licensed under the MIT License. See the `LICENSE` file for more details.
The adversarial examples generated by AudioShield must not be used for malicious purposes, such as disrupting the normal and legitimate use of ASR systems. Any consequences arising from such misuse are the sole responsibility of the user; neither the paper's publisher nor its authors bear any responsibility.
```bibtex
@inproceedings{jin2025whispering,
  author = {Jin, Weifei and Cao, Yuxin and Su, Junjie and Wang, Derui and Zhang, Yedi and Xue, Minhui and Hao, Jie and Dong, Jin Song and Yang, Yixian},
  title = {Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems},
  booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
  year = {2025},
  address = {Seattle, WA, USA}
}
```