Esteban Gutiérrez1 and Lonce Wyse1
1 Department of Information and Communications Technologies, Universitat Pompeu Fabra
This repository contains an implementation of all algorithms and models introduced in the thesis "Statistics-Driven Texture Sound Synthesis Using Differentiable Digital Signal Processing-Based Architectures", authored by Esteban Gutiérrez and advised by Lonce Wyse at the Universitat Pompeu Fabra.
The thesis explores adapting Differentiable Digital Signal Processing (DDSP) architectures, first introduced by Engel et al. [1], to synthesizing and controlling texture sounds, which are complex and noisy compared to traditional pitched-instrument timbres. It introduces two synthesizers: the
This repository contains a variety of functions, each demonstrated in one or more of the provided tester
Jupyter notebooks.
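
As background for the models in this repository, the snippet below sketches the filtered-noise building block of DDSP [1]: white noise shaped frame by frame by a time-varying magnitude response. It is only a minimal illustration, not one of the thesis's synthesizers, and the frame size, hop size, and low-pass example are assumptions.

```python
import numpy as np

def filtered_noise(magnitudes, frame_size=1024, hop=512, seed=0):
    """Shape white noise with per-frame filter magnitude responses.

    magnitudes: array of shape (n_frames, frame_size // 2 + 1).
    Returns a mono signal of length n_frames * hop.
    """
    rng = np.random.default_rng(seed)
    n_frames, _ = magnitudes.shape
    window = np.hanning(frame_size)
    out = np.zeros(n_frames * hop + frame_size)
    for i in range(n_frames):
        noise = rng.uniform(-1.0, 1.0, frame_size)              # white noise frame
        spectrum = np.fft.rfft(noise * window) * magnitudes[i]  # shape the spectrum
        frame = np.fft.irfft(spectrum, n=frame_size)            # back to the time domain
        out[i * hop : i * hop + frame_size] += frame            # overlap-add
    return out[: n_frames * hop]

# Example: a low-pass response whose cutoff opens slowly over 200 frames.
n_frames, n_bins = 200, 1024 // 2 + 1
cutoff = np.linspace(10, n_bins, n_frames)
magnitudes = (np.arange(n_bins)[None, :] < cutoff[:, None]).astype(float)
audio = filtered_noise(magnitudes)
```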
To train a model, follow these steps:

- Prepare a Configuration File: Create and fill out a JSON configuration file.
- Run Training: Execute the training process using the following command:

  ```
  python main.py train configuration.json
  ```

- Continue Training from a Checkpoint: Resume a previous run from its model folder using the following command:

  ```
  python main.py retrain model_folder
  ```
For detailed examples of the training process, refer to the training/wrapper_tester.ipynb notebook. To see a sample configuration file, check out auxiliar/config_template_pvae.json.
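
For orientation, a configuration file of this kind might be assembled as in the sketch below; the field names and values are illustrative assumptions only, so consult auxiliar/config_template_pvae.json for the actual keys the training wrapper expects.

```python
import json

# Hypothetical configuration -- every key below is an assumption for illustration;
# see auxiliar/config_template_pvae.json for the real schema.
config = {
    "dataset_path": "data/textures",      # folder containing the training audio (assumed key)
    "sample_rate": 16000,                 # audio sample rate in Hz (assumed key)
    "batch_size": 8,                      # training batch size (assumed key)
    "epochs": 100,                        # number of training epochs (assumed key)
    "output_folder": "runs/experiment_1", # where checkpoints are written (assumed key)
}

# Write the file passed to `python main.py train configuration.json`.
with open("configuration.json", "w") as f:
    json.dump(config, f, indent=4)
```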
[1] J. Engel, L. Hantrakul, C. Gu, and A. Roberts, “DDSP: Differentiable digital signal processing,” in International Conference on Learning Representations, 2020.
[2] N. Saint-Arnaud, “Classification of Sound Textures,” Master’s thesis, Massachusetts Institute of Technology, Cambridge, MA, 1995.
[3] J. H. McDermott and E. P. Simoncelli, “Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis,” Neuron, vol. 71, pp. 926–940, 2011.