PHI is the implementation of Action Quality Assessment (AQA) described in the paper "PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction". Please note that the original code was not preserved due to the passage of time and changes in hardware, so this repository is a reproduction of the paper's implementation.
Our PHI framework addresses the domain shift issue through two crucial processes. First, Gap Minimization Flow (GMF) progressively transforms the initial backbone feature into the desired one, minimizing the domain gap. Second, List-wise Contrastive Regularization (LCR) guides the model toward the subtle variations in actions, facilitating the transition from the coarse-grained to the fine-grained features crucial for AQA. Finally, the refined feature is fed to an MLP to predict the quality score.
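For orientation, here is a minimal sketch of how these pieces could fit together. It is illustrative only: the names (`PHISketch`, `lcr_sketch`) and all shapes are our assumptions, not the repository's actual API, and the list-wise term is a toy stand-in for the real LCR loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PHISketch(nn.Module):
    """Illustrative pipeline: backbone feature -> GMF refinement -> MLP score."""

    def __init__(self, feat_dim=1024, hidden_dim=256, n_steps=3):
        super().__init__()
        # GMF (sketch): residual blocks that progressively transform the
        # pretrained source-domain feature toward an AQA-specific one.
        self.gmf_steps = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
                          nn.Linear(hidden_dim, feat_dim))
            for _ in range(n_steps)])
        # MLP head that regresses the quality score from the refined feature.
        self.score_head = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))

    def forward(self, x):  # x: (batch, feat_dim) video-level feature
        for step in self.gmf_steps:
            x = x + step(x)  # progressive refinement narrows the domain gap
        return self.score_head(x).squeeze(-1), x

def lcr_sketch(feats, scores, margin=0.5):
    """Toy list-wise term: keep the features of samples that are adjacent in
    the quality ranking separable, so the model attends to subtle differences."""
    order = torch.argsort(scores, descending=True)
    f = F.normalize(feats[order], dim=-1)
    sim = (f[:-1] * f[1:]).sum(dim=-1)  # cosine similarity of ranked neighbours
    return F.relu(sim - margin).mean()  # penalize near-duplicate features
```

The command-line flags shown below (e.g. `--flow_hidden_dim`, `--margin`, `--beta`) presumably configure these components in `main.py`.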
Here are the instructions for obtaining the features and videos for the Rhythmic Gymnastics and Fis-V datasets used in our experiments:
For VST features:
- The VST features and label files of the Rhythmic Gymnastics and Fis-V datasets can be downloaded from the GDLT repository.
For I3D features:
- The I3D features and label files for both datasets will be released soon.
For Rhythmic Gymnastics videos:
- Download the videos from the ACTION-NET repository.
For Fis-V videos:
- Download the videos from the MS_LSTM repository.
Please use the above public repositories to obtain the features and videos needed to reproduce our results. Let us know if you need any clarification or have trouble accessing the data.
After downloading the Rhythmic Gymnastics dataset features and videos from the repositories referenced above, preprocess the data using rg_swintx.py:
# Set these paths inside rg_swintx.py before running.
# Choose the head used to extract features, e.g. load_model or load_model_I3d.
clip_len = 32  # clip length; matches the swintx_*_fps25_clip32 folders used below
data_path = '/{project path}/data'
orig_save = '/{project path}/data/swintx_orig_fps25_clip{}'.format(clip_len)
pool_save = '/{project path}/data/swintx_avg_fps25_clip{}'.format(clip_len)
# Command
python rg_swintx.py
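As a rough sketch of what the two output folders hold (inferred purely from their names, `orig` vs `avg`): the script saves per-clip backbone features to `orig_save` and a temporally average-pooled copy to `pool_save`, which is what the training command consumes. Something along these lines, with the array shape being an assumption:

```python
import os
import numpy as np

clip_len = 32
orig_save = '/{project path}/data/swintx_orig_fps25_clip{}'.format(clip_len)
pool_save = '/{project path}/data/swintx_avg_fps25_clip{}'.format(clip_len)
os.makedirs(pool_save, exist_ok=True)

# Pool each video's per-frame clip features over the temporal axis.
for name in os.listdir(orig_save):
    feats = np.load(os.path.join(orig_save, name))  # assumed (n_clips, clip_len, dim)
    np.save(os.path.join(pool_save, name), feats.mean(axis=1))  # -> (n_clips, dim)
```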
To get started, first clone this project and then install the required dependencies. The code was developed with the following environment:
- GPU: RTX 3090
- CUDA: 11.1
- Python: 3.8+
- PyTorch: 1.10.1+cu111
Install the required packages listed in the requirements.txt file:
pip install -r requirements.txt
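Before training, you can quickly confirm that the installed PyTorch matches the versions above:

```python
import torch

print(torch.__version__)              # expected: 1.10.1+cu111
print(torch.cuda.is_available())      # should be True with CUDA 11.1 set up
print(torch.cuda.get_device_name(0))  # e.g. an RTX 3090
```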
Use the following command to train the model:
CUDA_VISIBLE_DEVICES=${gpu} python main.py \
--video-path ${path}/swintx_avg_fps25_clip32 \
--train-label-path ${path}/train.txt \
--test-label-path ${path}/test.txt \
--model-name phi \
--action-type Ball \
--lr 1e-2 --epoch 200 \
--n_encoder 1 --n_decoder 2 --n_query 4 --alpha 1 --margin 1 --lr-decay cos --decay-rate 1e-2 --dropout 0.3 \
--loss_align 1 --activate-type 2 --n_head 1 --hidden_dim 256 --beta 0.01 --flow_hidden_dim 256
Use the following command to test the model:
CUDA_VISIBLE_DEVICES=${gpu} python main.py \
--video-path ${path}/swintx_avg_fps25_clip32 \
--train-label-path ${path}/train.txt \
--test-label-path ${path}/test.txt \
--model-name phi \
--action-type Ball \
--lr 1e-2 --epoch 200 \
--n_encoder 1 --n_decoder 2 --n_query 4 --alpha 1 --margin 1 --lr-decay cos --decay-rate 1e-2 --dropout 0.3 \
--loss_align 1 --activate-type 2 --n_head 1 --hidden_dim 256 --beta 0.01 --flow_hidden_dim 256 \
--test --ckpt {your model saving path}/best.pkl
We provide a detailed example to reproduce our results on the Ball (RG) dataset. The corresponding bash script, train_vst_rg_ball.sh, uses a two-stage training approach.
🚀 From the training log outputs/phi/Ball/second_phase/log.txt, we observe that the best Spearman's Rank Correlation Coefficient (SRCC) achieved exceeds the result reported in our paper.
📈 Additionally, you can select a result that balances SRCC and the relative L2 distance (RL2) based on the specific requirements of your application. For example, Epoch 23 closely aligns with or even surpasses the results presented in our paper.
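If you want to automate that choice, a small helper along these lines can scan the log for the epoch with the best SRCC/RL2 trade-off. The line format it parses is a guess, so adapt the regular expression to the actual log.txt, and note that SRCC minus RL2 is just one possible selection criterion:

```python
import re

# Assumed log format: lines containing epoch, srcc and rl2 values.
pattern = re.compile(
    r'epoch[:\s]+(\d+).*?srcc[:\s]+([\d.]+).*?rl2[:\s]+([\d.]+)', re.IGNORECASE)

best = None
with open('outputs/phi/Ball/second_phase/log.txt') as f:
    for line in f:
        m = pattern.search(line)
        if not m:
            continue
        epoch, srcc, rl2 = int(m.group(1)), float(m.group(2)), float(m.group(3))
        tradeoff = srcc - rl2  # maximize correlation while penalizing error
        if best is None or tradeoff > best[0]:
            best = (tradeoff, epoch, srcc, rl2)

if best:
    print('Balanced choice -> epoch {} (SRCC {:.4f}, RL2 {:.4f})'.format(*best[1:]))
```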
🌟 Be patient and persistent in tuning the code to achieve new state-of-the-art results.
This repository is built upon the CoFInAl framework. If you have any questions or need further assistance with the code, please feel free to reach out. Thank you for your interest and support!