PHI is the implementation of Action Quality Assessment (AQA) described in the paper "PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction". Please note that the original code was not preserved due to the passage of time and changes in hardware, so this repository is a reproduction of the paper's implementation.
Our PHI framework addresses the domain shift issue through two crucial processes. First, Gap Minimization Flow (GMF) progressively transforms the initial backbone feature into the desired one, minimizing the domain gap. Second, List-wise Contrastive Regularization (LCR) guides the model toward the subtle variations in actions, facilitating the transition from the coarse-grained to the fine-grained features crucial for AQA. Finally, the refined feature is fed to an MLP to predict the quality score.
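For orientation, here is a minimal sketch of how these pieces could fit together. It is illustrative only: the names (`PHISketch`, `lcr_sketch`) and all shapes are our assumptions, not the repository's actual API, and the list-wise term is a toy stand-in for the real LCR loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PHISketch(nn.Module):
    """Illustrative pipeline: backbone feature -> GMF refinement -> MLP score."""

    def __init__(self, feat_dim=1024, hidden_dim=256, n_steps=3):
        super().__init__()
        # GMF (sketch): residual blocks that progressively transform the
        # pretrained source-domain feature toward an AQA-specific one.
        self.gmf_steps = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
                          nn.Linear(hidden_dim, feat_dim))
            for _ in range(n_steps)])
        # MLP head that regresses the quality score from the refined feature.
        self.score_head = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))

    def forward(self, x):  # x: (batch, feat_dim) video-level feature
        for step in self.gmf_steps:
            x = x + step(x)  # progressive refinement narrows the domain gap
        return self.score_head(x).squeeze(-1), x

def lcr_sketch(feats, scores, margin=0.5):
    """Toy list-wise term: keep the features of samples that are adjacent in
    the quality ranking separable, so the model attends to subtle differences."""
    order = torch.argsort(scores, descending=True)
    f = F.normalize(feats[order], dim=-1)
    sim = (f[:-1] * f[1:]).sum(dim=-1)  # cosine similarity of ranked neighbours
    return F.relu(sim - margin).mean()  # penalize near-duplicate features
```

The command-line flags shown below (e.g. `--flow_hidden_dim`, `--margin`, `--beta`) presumably configure these components in `main.py`.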
Here are the instructions for obtaining the features and videos for the Rhythmic Gymnastics and Fis-V datasets used in our experiments:
For VST features:
- The VST features and label files of the Rhythmic Gymnastics and Fis-V datasets can be downloaded from the GDLT repository.
For I3D features:
- The I3D features and label files for both datasets will be released soon.
For Rhythmic Gymnastics videos:
- Download the videos from the ACTION-NET repository.
For Fis-V videos:
- Download the videos from the MS_LSTM repository.
Please use the above public repositories to obtain the features and videos needed to reproduce our results. Let us know if you need any clarification or have trouble accessing the data.
After downloading the Rhythmic Gymnastics dataset features and videos from the repositories referenced above, preprocess the data using rg_swintx.py:
# Set these paths inside rg_swintx.py before running.
# Choose the head used to extract features, e.g. load_model or load_model_I3d.
clip_len = 32  # clip length; matches the swintx_*_fps25_clip32 folders used below
data_path = '/{project path}/data'
orig_save = '/{project path}/data/swintx_orig_fps25_clip{}'.format(clip_len)
pool_save = '/{project path}/data/swintx_avg_fps25_clip{}'.format(clip_len)
# Command
python rg_swintx.py
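As a rough sketch of what the two output folders hold (inferred purely from their names, `orig` vs `avg`): the script saves per-clip backbone features to `orig_save` and a temporally average-pooled copy to `pool_save`, which is what the training command consumes. Something along these lines, with the array shape being an assumption:

```python
import os
import numpy as np

clip_len = 32
orig_save = '/{project path}/data/swintx_orig_fps25_clip{}'.format(clip_len)
pool_save = '/{project path}/data/swintx_avg_fps25_clip{}'.format(clip_len)
os.makedirs(pool_save, exist_ok=True)

# Pool each video's per-frame clip features over the temporal axis.
for name in os.listdir(orig_save):
    feats = np.load(os.path.join(orig_save, name))  # assumed (n_clips, clip_len, dim)
    np.save(os.path.join(pool_save, name), feats.mean(axis=1))  # -> (n_clips, dim)
```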
To get started, first clone this project and then install the required dependencies. The code was developed with the following environment:
- GPU: RTX 3090
- CUDA: 11.1
- Python: 3.8+
- PyTorch: 1.10.1+cu111
Install the required packages listed in the requirements.txt file:
pip install -r requirements.txt
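Before training, you can quickly confirm that the installed PyTorch matches the versions above:

```python
import torch

print(torch.__version__)              # expected: 1.10.1+cu111
print(torch.cuda.is_available())      # should be True with CUDA 11.1 set up
print(torch.cuda.get_device_name(0))  # e.g. an RTX 3090
```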
Use the following command to train the model:
CUDA_VISIBLE_DEVICES=${gpu} python main.py \
--video-path ${path}/swintx_avg_fps25_clip32 \
--train-label-path ${path}/train.txt \
--test-label-path ${path}/test.txt \
--model-name phi \
--action-type Ball \
--lr 1e-2 --epoch 200 \
--n_encoder 1 --n_decoder 2 --n_query 4 --alpha 1 --margin 1 --lr-decay cos --decay-rate 1e-2 --dropout 0.3 \
--loss_align 1 --activate-type 2 --n_head 1 --hidden_dim 256 --beta 0.01 --flow_hidden_dim 256
Use the following command to test the model:
CUDA_VISIBLE_DEVICES=${gpu} python main.py \
--video-path ${path}/swintx_avg_fps25_clip32 \
--train-label-path ${path}/train.txt \
--test-label-path ${path}/test.txt \
--model-name phi \
--action-type Ball \
--lr 1e-2 --epoch 200 \
--n_encoder 1 --n_decoder 2 --n_query 4 --alpha 1 --margin 1 --lr-decay cos --decay-rate 1e-2 --dropout 0.3 \
--loss_align 1 --activate-type 2 --n_head 1 --hidden_dim 256 --beta 0.01 --flow_hidden_dim 256 \
--test --ckpt {your model saving path}/best.pkl
We provide a detailed example to reproduce our results on the Ball (RG) dataset. The corresponding bash script, train_vst_rg_ball.sh, uses a two-stage training approach.
🚀 From the training log outputs/phi/Ball/second_phase/log.txt, we observe that the best Spearman's Rank Correlation Coefficient (SRCC) achieved exceeds the result reported in our paper.
📈 Additionally, you can select a result that balances SRCC and the relative L2 distance (RL2) based on the specific requirements of your application. For example, Epoch 23 closely aligns with or even surpasses the results presented in our paper.
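If you want to automate that choice, a small helper along these lines can scan the log for the epoch with the best SRCC/RL2 trade-off. The line format it parses is a guess, so adapt the regular expression to the actual log.txt, and note that SRCC minus RL2 is just one possible selection criterion:

```python
import re

# Assumed log format: lines containing epoch, srcc and rl2 values.
pattern = re.compile(
    r'epoch[:\s]+(\d+).*?srcc[:\s]+([\d.]+).*?rl2[:\s]+([\d.]+)', re.IGNORECASE)

best = None
with open('outputs/phi/Ball/second_phase/log.txt') as f:
    for line in f:
        m = pattern.search(line)
        if not m:
            continue
        epoch, srcc, rl2 = int(m.group(1)), float(m.group(2)), float(m.group(3))
        tradeoff = srcc - rl2  # maximize correlation while penalizing error
        if best is None or tradeoff > best[0]:
            best = (tradeoff, epoch, srcc, rl2)

if best:
    print('Balanced choice -> epoch {} (SRCC {:.4f}, RL2 {:.4f})'.format(*best[1:]))
```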
🌟 Be patient and persistent in tuning the code to achieve new state-of-the-art results.
This repository is built upon the CoFInAl framework. If you have any questions or need further assistance with the code, please feel free to reach out. Thank you for your interest and support!