This repo provides the official PyTorch implementation of SiFi-GAN, a fast and pitch-controllable high-fidelity neural vocoder.
It also provides code and instructions for serving the model via the NVIDIA Triton Inference Server.
Install the Python dependencies:
cd SiFiGAN
pip install -e .
Install Docker: https://docs.docker.com/engine/install/
Install the NVIDIA Container Toolkit: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
In this example we create a new checkpoints folder under sifigan/nv_triton/misc and download the pretrained checkpoint into it.
cd sifigan/nv_triton/misc
mkdir checkpoints
cd checkpoints
# download model from dropbox
wget -O checkpoint.pkl https://www.dropbox.com/s/w3pnnmpsxvqfykx/checkpoint-1000000steps.pkl?dl=0
# go back to sifigan/nv_triton/misc dir
cd ..
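Optionally, you can sanity-check the downloaded checkpoint by loading it in Python. This is only an informal sketch, assuming the checkpoint is an ordinary PyTorch pickle; it is not required for the serving pipeline.

# Optional sanity check: confirm that the downloaded checkpoint loads.
import torch

# On newer PyTorch versions you may need torch.load(..., weights_only=False).
ckpt = torch.load("checkpoints/checkpoint.pkl", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    # Print the top-level keys to see what the checkpoint contains.
    print(list(ckpt.keys()))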
You can prepare any of the model formats described in this section and serve it.
Run the following scripts from the sifigan/nv_triton/misc directory.
python3 model_to_jit.py checkpoints/checkpoint.pkl ../server/model-repository/sifigan-pt-fp32/1/model.pt
checkpoints/checkpoint.pkl is the input checkpoint path, and ../server/model-repository/sifigan-pt-fp32/1/model.pt is the output path for the compiled JIT (TorchScript) FP32 model.
python3 model_to_jit.py checkpoints/checkpoint.pkl ../server/model-repository/sifigan-pt-fp16/1/model.pt --fp16=true
checkpoints/checkpoint.pkl is the input checkpoint path, ../server/model-repository/sifigan-pt-fp16/1/model.pt is the output path for the compiled JIT (TorchScript) FP16 model, and --fp16=true indicates that the model should be exported in FP16 precision.
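Under the hood, model_to_jit.py loads the generator weights and exports a TorchScript module. The following is only a simplified sketch of that kind of conversion, not the script's exact code; the SiFiGANGenerator import path, its constructor arguments, and the checkpoint layout are assumptions.

# Simplified TorchScript export sketch (see model_to_jit.py for the real logic).
import torch
from sifigan.models import SiFiGANGenerator  # assumed import path

ckpt = torch.load("checkpoints/checkpoint.pkl", map_location="cpu")
model = SiFiGANGenerator()  # the real script builds the generator with hyperparameters from the training config
model.load_state_dict(ckpt["model"]["generator"])  # assumed checkpoint layout
model.eval()
# With --fp16=true the model would be converted via model.half() before scripting.

scripted = torch.jit.script(model)  # tracing with torch.jit.trace(model, example_inputs) is an alternative
scripted.save("../server/model-repository/sifigan-pt-fp32/1/model.pt")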
For ONNX generation you need an example of real input. You can download one from Dropbox.
wget -O checkpoints/test_tensor.pth https://www.dropbox.com/s/8ccrv26a2t8fed9/test_tensor.pth?dl=0
Run the ONNX generation:
python3 model_to_onnx.py checkpoints/checkpoint.pkl ../server/model-repository/sifigan-onnx-fp32/1/model.onnx \
checkpoints/test_tensor.pth
checkpoints/checkpoint.pkl is the input checkpoint path, ../server/model-repository/sifigan-onnx-fp32/1/model.onnx is the output path for the ONNX FP32 model, and checkpoints/test_tensor.pth is the path to the example input for the network.
python3 model_to_onnx.py checkpoints/checkpoint.pkl ../server/model-repository/sifigan-onnx-fp16/1/model.onnx \
checkpoints/test_tensor.pth --fp16=true
checkpoints/checkpoint.pkl is the input checkpoint path, ../server/model-repository/sifigan-onnx-fp16/1/model.onnx is the output path for the ONNX FP16 model, checkpoints/test_tensor.pth is the path to the example input for the network, and --fp16=true indicates that the model should be exported in FP16 precision.
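The ONNX conversion follows the standard torch.onnx.export flow, tracing the generator with the example input downloaded above. Below is a simplified sketch, not the script's exact code; the opset version, input names, and dynamic axes are assumptions that model_to_onnx.py defines for real.

# Simplified ONNX export sketch (see model_to_onnx.py for the real logic).
import torch
from sifigan.models import SiFiGANGenerator  # assumed import path, as in the JIT sketch

ckpt = torch.load("checkpoints/checkpoint.pkl", map_location="cpu")
model = SiFiGANGenerator()  # hyperparameters come from the training config in the real script
model.load_state_dict(ckpt["model"]["generator"])  # assumed checkpoint layout
model.eval()

example_inputs = torch.load("checkpoints/test_tensor.pth")  # the example input downloaded above

torch.onnx.export(
    model,
    example_inputs,
    "../server/model-repository/sifigan-onnx-fp32/1/model.onnx",
    opset_version=17,                   # assumed opset version
    input_names=["x", "c", "d", "f0"],  # hypothetical names; the script defines the real ones
    dynamic_axes={"x": {2: "time"}},    # only when --use_dynamic_shape=true (the default)
)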
WARNING! The section related to TensorRT generation is completely optional.
Decoding via TensorRT models is currently not supported because TensorRT works only with static shapes.
However, TensorRT models can still be used for performance analysis.
IMPORTANT! In order to generate TensorRT models you have to install TensorRT.
First, copy the model configs for the TensorRT models to the NVIDIA Triton model repository:
cp -r trt-models-conifgs/* ../server/model-repository/
IMPORTANT! If you run into an error while generating the TensorRT models and do not plan to use them afterwards, clean up the directories:
rm -rf ../server/model-repository/sifigan-trt-fp32
rm -rf ../server/model-repository/sifigan-trt-fp16
# generate onnx model with static shape
python3 model_to_onnx.py checkpoints/checkpoint.pkl checkpoints/model.onnx checkpoints/test_tensor.pth --use_dynamic_shape=false
# generate TensorRT plan from ONNX and put it in the right place
python3 onnx_to_tensorrt.py checkpoints/model.onnx ../server/model-repository/sifigan-trt-fp32/1/model.plan
# generate onnx model with static shape in FP16 mode
python3 model_to_onnx.py checkpoints/checkpoint.pkl checkpoints/model.onnx checkpoints/test_tensor.pth --use_dynamic_shape=false --fp16=true
# generate TensorRT plan from ONNX in FP16 mode and put it in the right place
python3 onnx_to_tensorrt.py checkpoints/model.onnx ../server/model-repository/sifigan-trt-fp16/1/model.plan --fp16=true
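onnx_to_tensorrt.py essentially parses the static-shape ONNX file and serializes a TensorRT engine. A rough sketch with the TensorRT Python API (assuming a TensorRT 8.x installation; the real script may differ) could look like this:

# Rough TensorRT plan-building sketch (see onnx_to_tensorrt.py for the real logic).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("checkpoints/model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
# config.set_flag(trt.BuilderFlag.FP16)  # enable for the FP16 plan (--fp16=true)

plan = builder.build_serialized_network(network, config)
with open("../server/model-repository/sifigan-trt-fp32/1/model.plan", "wb") as f:
    f.write(plan)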
To run the NVIDIA Triton Inference Server, go back to the root directory of the repo and run the following command:
sudo docker run --gpus=1 --rm --net=host -p8000:8000 -p8001:8001 -p8002:8002 \
-v ${PWD}/sifigan/nv_triton/server/model-repository:/models \
nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models
You should see the following three lines at the end of the command output:
Started GRPCInferenceService at 0.0.0.0:8001
Started HTTPService at 0.0.0.0:8000
Started Metrics Service at 0.0.0.0:8002
This output means that the Triton server is running and you can send inference requests to it.
Leave the terminal with the running server open, and open a new terminal for the client requests.
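Before preparing the data, you can optionally verify from the new terminal that the server and your exported models are reachable. This sketch assumes the tritonclient Python package is installed (pip install tritonclient[http]); it is not required for the steps below.

# Optional readiness check against the running Triton server.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
print("server ready:", client.is_server_ready())
# Replace the name with whichever model you exported earlier.
print("model ready:", client.is_model_ready("sifigan-pt-fp32"))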
First, you need to prepare the input data. In the following example we download input data from Dropbox and place it under the sifigan/nv_triton/client/data/in folder. You can place your data wherever you want; just adjust the paths in the examples accordingly.
# assuming we are in the root folder of the repo, go to the sifigan/nv_triton/client dir
cd sifigan/nv_triton/client
mkdir -p data/in
cd data/in
# download example input data from Dropbox
wget -O example_input_data.tar https://www.dropbox.com/s/qt3jkh2r3fzuge2/example_input_data.tar?dl=0
# unpack it
tar -xvf example_input_data.tar
# delete tar file
rm example_input_data.tar
# go back to sifigan/nv_triton/client dir
cd ../..
To prepare your sound files for inference, use the extract_features.py script:
python3 extract_features.py --input_dir=data/in --output_dir=data/features
It takes all files from --input_dir, preprocesses them, and saves them in the proper format to --output_dir.
The feature extraction process depends on a number of hyperparameters, which are defined in a .yaml config file.
By default, extract_features.py uses the extract_features_default.yaml config file. You can modify it or create your own config file and pass its path via the --path_to_config argument.
To run SiFi-GAN vocoding on the features extracted in the previous step, use the decode.py script:
python3 decode.py --input_dir=data/features --output_dir=data/out --model=sifigan-pt-fp32
It takes all extracted features from --input_dir, vocodes each of them, and saves the resulting sound files into --output_dir.
--model indicates which model the inference request is sent to. You can only use models that you generated in the "Prepare pretrained checkpoint for serving" step. If you generated all model types in that step, the following values are available for this parameter:
sifigan-pt-fp32
sifigan-onnx-fp32
sifigan-pt-fp16
sifigan-onnx-fp16
The vocoding process also depends on a number of hyperparameters defined in a .yaml config file.
By default, decode.py uses the decode_default.yaml config file. You can modify it or create your own config file and pass its path via the --path_to_config argument.
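decode.py handles the request logic for you, but for reference, a raw request through the tritonclient HTTP API roughly follows the pattern below. The tensor name, shape, and dtype here are placeholders to illustrate the API only; the real SiFi-GAN models take several input tensors whose names and shapes are defined in each model's config.pbtxt under server/model-repository.

# Rough sketch of a raw Triton inference request (decode.py does something similar internally).
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

dummy = np.zeros((1, 43, 100), dtype=np.float32)  # placeholder tensor, not a real feature layout
infer_input = httpclient.InferInput("INPUT__0", list(dummy.shape), "FP32")  # placeholder name
infer_input.set_data_from_numpy(dummy)

result = client.infer(
    model_name="sifigan-pt-fp32",
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("OUTPUT__0")],  # placeholder name
)
audio = result.as_numpy("OUTPUT__0")
print(audio.shape)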
After the previous step, the output sound files should be in the --output_dir directory.
You can listen to them and verify that everything is fine.
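If you also want to check the outputs programmatically, here is a small sketch using the soundfile package (an extra dependency, assuming the outputs are .wav files) that prints the duration and sample rate of each generated file:

# Quick programmatic check of the generated audio files.
import glob
import soundfile as sf

for path in sorted(glob.glob("data/out/**/*.wav", recursive=True)):
    audio, sample_rate = sf.read(path)
    print(f"{path}: {len(audio) / sample_rate:.2f} s at {sample_rate} Hz")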