
GTN

Built with Python 3.10, PyTorch 1.12, and PyTorch Lightning 1.7.

Official implementation of the paper Generalizing teacher networks for effective knowledge distillation across student architectures (BMVC'24)

Authors: Kuluhan Binici, Weiming Wu, Tulika Mitra

[Paper]


Getting Started

Step 1: Generate config.py

Run the following Makefile target to generate config.py:

make config

Step 2: Update Paths

Edit config.py to set the correct paths for your data and model folders:

DATA_ROOT = "PATH/TO/DATA/FOLDER"
MODEL_ROOT = "PATH/TO/MODELS/FOLDER"
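
For example, a filled-in config.py might look like this (the paths below are illustrative only; use the locations on your own machine):

DATA_ROOT = "/home/user/datasets"        # illustrative: folder containing the datasets
MODEL_ROOT = "/home/user/gtn-models"     # illustrative: folder for pre-trained / saved models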

Downloading Pre-Trained Teachers

You can download pre-trained teacher model checkpoints from this Google Drive link.
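
The repository does not state where the downloaded checkpoints must be placed; a reasonable guess, given the MODEL_ROOT setting above, is to move them into that folder (paths and file names below are illustrative, not from the repository):

mkdir -p /home/user/gtn-models
mv ~/Downloads/teacher-checkpoints/*.ckpt /home/user/gtn-models/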


Train Teacher / Student Models from Scratch

To train any network on any dataset, use the following command:

python train-model-no-KD.py --model $NETWORK_NAME --dataset $DATASET_NAME
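
For instance, assuming the paper's CIFAR-100 setup, an invocation might look like the following (the exact model and dataset identifiers accepted by the script are not documented here, so treat these values as placeholders and check scripts/no-KD.sh for the real ones):

python train-model-no-KD.py --model resnet34 --dataset cifar100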

Alternatively, you can use the provided bash script:

bash scripts/no-KD.sh

Trained models will be saved in the checkpoints/ directory.


KD-Aware Teacher Training

To train a KD-aware teacher, run the KD-aware-teacher-training.py script.

For GTN Training:

Use the --student supernet argument:

python KD-aware-teacher-training.py --student supernet

Or simply use the pre-made bash script:

bash scripts/gtn.sh

For SFTN Training:

Use the --student isolated_normal argument:

python KD-aware-teacher-training.py --student isolated_normal

Or run the provided bash script:

bash scripts/sftn-teacher.sh

The resulting teacher model checkpoints will be saved in the checkpoints/ directory.


Distilling Student Models Using Pre-Trained Teachers

To distill student models using pre-trained teacher models, run the distill-student.py script with the --kdtrain $KD_METHOD argument, where $KD_METHOD is the name of the distillation method. Options include:

  • DKD
  • SFTN
  • SCKD
  • vanilla

Example:

python distill-student.py --kdtrain DKD
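
Because the method name is just a command-line argument, all four methods can be run back to back with a simple loop (a minimal sketch; any additional arguments a particular method may require are not shown):

for method in DKD SFTN SCKD vanilla; do
    python distill-student.py --kdtrain "$method"
done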

Alternatively, you can use the provided bash scripts located in the scripts/ directory:

  • scripts/DKD.sh
  • scripts/SFTN-kd.sh
  • scripts/SCKD.sh
  • scripts/vanilla-kd.sh

The resulting student model checkpoints will be saved in the checkpoints/ directory.
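
Putting the steps together, a full KD-aware-teacher-plus-distillation run can be sketched as follows (this assumes the distillation script picks up the teacher checkpoint produced in the previous step; verify the paths inside the scripts before relying on this):

# 1. train a GTN (KD-aware) teacher
bash scripts/gtn.sh

# 2. distill a student from the trained teacher with DKD
bash scripts/DKD.sh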
