GitHub - WeifeiJin/AudioShield: [USENIX Security '25] Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems

AudioShield

This is the source code repository for the paper "Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems". The paper has been accepted to the 34th USENIX Security Symposium, 2025.

AudioShield uses a transferable universal adversarial perturbation in latent space (LS-TUAP) to provide real-time speech privacy protection while meeting three key requirements: real-time performance, model-agnosticism, and high audio quality. A demo page is available at https://sites.google.com/view/lstuap.
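As a rough illustration of what "universal" and "in latent space" mean here, the toy sketch below adds one shared, bounded perturbation to the latent codes of several inputs and decodes the result. It is not the paper's algorithm: the linear encoder/decoder, the `epsilon` bound, and every function name are hypothetical stand-ins (AudioShield itself uses a VITS autoencoder), and no perturbation is optimized.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an autoencoder: an invertible linear map.
DIM = 8
W = rng.normal(size=(DIM, DIM)) + DIM * np.eye(DIM)  # toy encoder weights
W_INV = np.linalg.inv(W)                              # matching toy decoder


def encode(x):
    """Map a batch of signals (n, DIM) to latent codes (n, DIM)."""
    return x @ W.T


def decode(z):
    """Map latent codes back to the signal domain."""
    return z @ W_INV.T


def apply_universal_perturbation(x, delta, epsilon):
    """Add the same bounded latent perturbation to every input in the batch."""
    bounded = np.clip(delta, -epsilon, epsilon)  # assumed L-infinity bound
    return decode(encode(x) + bounded)


signals = rng.normal(size=(4, DIM))
delta = rng.normal(size=DIM)  # one perturbation, shared by all inputs
protected = apply_universal_perturbation(signals, delta, epsilon=0.1)
```

Because the perturbation does not depend on the input, it can be computed once ahead of time and then applied to new audio with a single encode/decode pass, which is what makes a real-time setting plausible.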

Setup

Dependencies

To run the code, ensure the following dependencies are installed:

  • Python == 3.8
  • PyTorch == 2.2.2
  • CUDA == 12.2
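
A quick way to compare an existing environment against the versions listed above is a small check script. The sketch below is only a convenience and not part of the repository; `check_environment` is a hypothetical helper name. Note that `torch.version.cuda` reports the CUDA version PyTorch was built with, which may differ from the system-wide CUDA toolkit.

```python
import sys


def check_environment(expected_python=(3, 8)):
    """Collect interpreter and (if installed) PyTorch/CUDA version details."""
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_matches": sys.version_info[:2] == expected_python,
        "torch": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    try:
        import torch  # optional: only reported when PyTorch is importable
    except (ImportError, OSError):  # missing or broken installation
        return report
    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # CUDA version PyTorch was built with
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```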

Install espeak:

apt-get install espeak

Create and activate a conda environment:

conda create -n AudioShield python=3.8
conda activate AudioShield

Then the required dependencies can be installed by running:

pip install -r requirements.txt

Build Monotonic Alignment Search for VITS model:

cd monotonic_align
python setup.py build_ext --inplace

Pretrained Models

  • The VITS model is used as the Autoencoder. The pre-trained model can be found here.
  • DeepSpeech2 is employed as the local target model. The implemented version is available here.

Download the VITS and DeepSpeech2 models and place them in the following folders:

  • pretrained/vits/
  • pretrained/deepspeech/

You can use different paths, but make sure they match the paths set in protection.json.
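
Before training, it can help to confirm that the model folders are actually populated. The following is a minimal sketch, assuming the default folder names above; `missing_model_dirs` is a hypothetical helper, and it does not read protection.json, whose structure is not described here.

```python
from pathlib import Path

# Folder names taken from the README; change them if protection.json points elsewhere.
DEFAULT_MODEL_DIRS = ("pretrained/vits", "pretrained/deepspeech")


def missing_model_dirs(root=".", model_dirs=DEFAULT_MODEL_DIRS):
    """Return the model folders that are absent or empty under `root`."""
    missing = []
    for name in model_dirs:
        folder = Path(root) / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_model_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All pretrained model folders are in place.")
```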

Dataset

The Librispeech dataset can be downloaded from here. The dev-clean subset is used in this implementation.
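
To sanity-check the download, the sketch below walks a LibriSpeech subset and pairs each utterance with its transcript. It relies only on the standard LibriSpeech layout (`<speaker>/<chapter>/<speaker>-<chapter>.trans.txt` next to the `.flac` files); `iter_librispeech` and the example path are hypothetical, and the repository's own data_preprocessing.py may organize the data differently.

```python
from pathlib import Path


def iter_librispeech(subset_dir):
    """Yield (flac_path, transcript) pairs from a LibriSpeech subset folder.

    Expects the standard layout:
    <subset>/<speaker>/<chapter>/<speaker>-<chapter>.trans.txt
    """
    for trans_file in sorted(Path(subset_dir).glob("*/*/*.trans.txt")):
        for line in trans_file.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            # Each line is "<utterance-id> <TRANSCRIPT IN UPPER CASE>".
            utt_id, _, text = line.partition(" ")
            yield trans_file.parent / f"{utt_id}.flac", text.strip()


if __name__ == "__main__":
    # Hypothetical location; adjust to where dev-clean was extracted.
    for path, text in iter_librispeech("LibriSpeech/dev-clean"):
        print(path, "->", text)
        break
```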

Training

Preprocessing

  • Execute python data_preprocessing.py to process the raw dataset.
  • Navigate to the datasets folder and run python librispeech.py to process the latent code data.

Training

  • Return to the main directory and execute python train.py --tgt_text "OPEN THE DOOR" for a quick training session.
  • Alternatively, the following command allows manual configuration of arguments:
python train.py \
	--training_iters <number_of_iterations> \
	--tau <tau_hyperparameter> \
	--device <device_type> \
	--tgt_text <target_text> \
	--output_dir <output_directory>
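
For scripted runs (for example, trying several target texts), the command above can be assembled from Python. This is a sketch under assumptions: `build_train_command` is a hypothetical helper, the values shown are placeholders rather than recommended settings, and unset flags are omitted on the assumption that train.py supplies its own defaults, as the quick-start command suggests.

```python
import shlex
import sys


def build_train_command(tgt_text, training_iters=None, tau=None,
                        device=None, output_dir=None):
    """Assemble an argument list for train.py from the flags listed above."""
    cmd = [sys.executable, "train.py", "--tgt_text", tgt_text]
    optional = {
        "--training_iters": training_iters,
        "--tau": tau,
        "--device": device,
        "--output_dir": output_dir,
    }
    for flag, value in optional.items():
        if value is not None:  # omit unset flags so train.py can use its defaults
            cmd += [flag, str(value)]
    return cmd


# Placeholder values for illustration only; they are not recommended settings.
cmd = build_train_command("OPEN THE DOOR", training_iters=1000,
                          device="cuda:0", output_dir="./output")
print(shlex.join(cmd))
# To launch: subprocess.run(cmd, check=True) from the repository root.
```

Passing the arguments as a list avoids shell-quoting problems with target texts that contain spaces.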

Evaluation

  • After training is complete, use the saved perturbation for evaluation. The ptb_path parameter is the path where the perturbation is stored, and output_dir is the path where the evaluated audio files will be saved.
  • Run the evaluation script with the following command line:
python eval.py \
	--ptb_path "LS_TUAP.pth" \
	--device "cuda:0" \
	--output_dir "./results" \
	--sampling_rate 16000
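
After evaluation, one simple check is that the generated audio has the expected sampling rate. The sketch below assumes the outputs are PCM `.wav` files under `output_dir`, which is an assumption about eval.py rather than something documented here; the helper names are hypothetical. The standard-library `wave` module only reads PCM WAV, so other encodings would need a different reader.

```python
import wave
from pathlib import Path


def wav_sample_rates(output_dir):
    """Map each .wav file under `output_dir` to its sampling rate in Hz."""
    rates = {}
    for wav_path in sorted(Path(output_dir).rglob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            rates[str(wav_path.relative_to(output_dir))] = wav_file.getframerate()
    return rates


def files_with_unexpected_rate(output_dir, expected_rate=16000):
    """List files whose sampling rate differs from `expected_rate`."""
    return [name for name, rate in wav_sample_rates(output_dir).items()
            if rate != expected_rate]


if __name__ == "__main__":
    print(files_with_unexpected_rate("./results"))
```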

Acknowledgement

Part of the implementation is built on VITS and DeepSpeech2. We thank their authors for their outstanding contributions.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Disclaimer

The adversarial examples generated by AudioShield must not be used for malicious purposes, such as disrupting the normal and legitimate use of ASR systems. Any consequences arising from such misuse are the sole responsibility of the user; the paper's publisher and authors bear no responsibility for them.

Citation

@inproceedings{jin2025whispering,
  author = {Jin, Weifei and Cao, Yuxin and Su, Junjie and Wang, Derui and Zhang, Yedi and Xue, Minhui and Hao, Jie and Dong, Jin Song and Yang, Yixian},
  title = {Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems},
  booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
  year = {2025},
  address = {Seattle, WA, USA}
}
