This repository contains the implementation of PIPNet, a robust approach for facial landmark detection using a deep learning model based on ResNet architectures.
The PIPNet model attains a minimum Normalized Mean Error (NME) of 2.6% on the 300W dataset, one of the most challenging benchmarks in facial landmark detection. This result demonstrates the model's accuracy and robustness on complex facial landmark detection tasks.
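For reference, NME averages the Euclidean distance between predicted and ground-truth landmarks and divides it by a normalization distance. The sketch below is a minimal illustration, not the repository's evaluation code. The function name `nme` and the choice of normalization distance (on 300W this is commonly the inter-ocular distance) are assumptions.

```python
import math


def nme(pred, gt, norm_dist):
    """Mean Euclidean landmark error divided by a normalization distance.

    pred, gt: sequences of (x, y) landmark coordinates in the same order.
    norm_dist: normalization distance (assumed here: inter-ocular distance).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    total_error = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total_error / (len(pred) * norm_dist)


# Toy example: two landmarks, each predicted 1 pixel away from the ground truth,
# normalized by the 10-pixel distance between the two ground-truth points.
gt_points = [(0.0, 0.0), (10.0, 0.0)]
pred_points = [(0.0, 1.0), (10.0, -1.0)]
score = nme(pred_points, gt_points, norm_dist=math.dist(*gt_points))
print(f"NME: {score * 100:.1f}%")
```

Multiplying the returned fraction by 100 gives the percentage form used when reporting results.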
conda create -n PyTorch python=3.8
conda activate PyTorch
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts
pip install opencv-python==4.5.5.64
pip install PyYAML
pip install tqdm
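After installation, a quick check that the packages are importable can save debugging time later. This is a hedged sketch: the helper name `missing_packages` is hypothetical, and the list uses the import names of the packages above (`cv2` for opencv-python, `yaml` for PyYAML).

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names corresponding to the packages installed above.
required = ["torch", "torchvision", "cv2", "yaml", "tqdm"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```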
Datasets: 300W
- Download the datasets from official sources.
- Use the convert() function (a Python call, not a shell command) to preprocess the 300W dataset:
convert(data_dir="/path/to/300W_dataset", target_size=256)
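Preprocessing for landmark datasets typically crops each face and resizes it to a fixed size, which means the landmark coordinates have to be mapped into the resized crop as well. The sketch below illustrates that coordinate mapping only. It is not the repository's convert() implementation; the helper name `rescale_landmarks` and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
def rescale_landmarks(landmarks, box, target_size=256):
    """Map (x, y) landmarks from image coordinates into a square crop.

    landmarks: sequence of (x, y) points in original image coordinates.
    box: assumed face box as (x_min, y_min, x_max, y_max).
    target_size: side length of the resized crop in pixels.
    """
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    # Shift into the crop, then scale each axis to the target size.
    return [
        ((x - x_min) * target_size / width, (y - y_min) * target_size / height)
        for x, y in landmarks
    ]


# A landmark at the center of a 200x200 face box lands at the center of the crop.
scaled = rescale_landmarks([(200, 150)], box=(100, 50, 300, 250), target_size=256)
print(scaled)
```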
To train the model, configure your dataset path in main.py, then run:
$ python main.py --train --input-size 256 --batch-size 16 --epochs 60
To test the model, configure your dataset path in main.py, then run:
$ python main.py --test
To run the real-time facial landmark detection:
$ python main.py --demo
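The flags used in the commands above could be parsed along these lines. This is a minimal sketch under stated assumptions, not the actual main.py: the defaults are taken from the example training command, and treating `--train`, `--test`, and `--demo` as mutually exclusive is an assumption.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in this README (layout is assumed)."""
    parser = argparse.ArgumentParser(description="PIPNet facial landmark detection")
    # Assumption: exactly one mode is selected per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--train", action="store_true", help="train the model")
    mode.add_argument("--test", action="store_true", help="evaluate the model")
    mode.add_argument("--demo", action="store_true", help="run real-time detection")
    # Assumption: defaults mirror the example training command.
    parser.add_argument("--input-size", type=int, default=256)
    parser.add_argument("--batch-size", type=int, default=16)
    parser.add_argument("--epochs", type=int, default=60)
    return parser


# Parse the example training command from this README.
args = build_parser().parse_args(
    ["--train", "--input-size", "256", "--batch-size", "16", "--epochs", "60"]
)
print(args.train, args.input_size, args.batch_size, args.epochs)
```

Note that argparse converts dashes in flag names to underscores, so `--input-size` is read as `args.input_size`.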
| Backbone | Epochs | Test NME | Pretrained weights |
|---|---|---|---|
| ResNet18 | 120 | 3.29 | model |