GitHub - xiaohaochen0308/MD-MLLM: Multimodal Classification Modal Decoupling

xiaohaochen0308/MD-MLLM

MD-MLLM

Disentangled Image-Text Classification

Using MLLM Knowledge to Bridge Visual Representations

This repository contains the official PyTorch implementation of the paper:

“Disentangled Image-Text Classification: Enhancing Visual Representations with MLLM-driven Knowledge Transfer”

Pretrained Checkpoints:

We provide pretrained MD-MLLM checkpoints for the N24News and Food-101 datasets so that the results reported in our paper can be reproduced.

N24News Dataset (Accuracy: 86.08%): Download Checkpoint.

Food-101 Dataset (Accuracy: 95.02%): Download Checkpoint.

You can use these checkpoints for evaluation or for fine-tuning on related tasks.
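Until the model code is released, the sketch below shows only the generic PyTorch pattern for loading a downloaded checkpoint for evaluation. Everything specific in it is an assumption: `StandInClassifier` is a hypothetical placeholder rather than the MD-MLLM architecture, the file name `demo_checkpoint.pth` is invented, and the wrapper keys checked in `extract_state_dict` are common conventions, not the confirmed layout of the released files.

```python
import os
import tempfile

import torch
from torch import nn


class StandInClassifier(nn.Module):
    """Hypothetical placeholder; the real MD-MLLM model is not released yet."""

    def __init__(self, in_features: int = 16, num_classes: int = 24):
        super().__init__()
        # Sizes are illustrative only.
        self.head = nn.Linear(in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(x)


def extract_state_dict(checkpoint):
    """Return the weights whether they are stored bare or under a wrapper key."""
    if isinstance(checkpoint, dict):
        for key in ("state_dict", "model"):  # assumed wrapper keys
            if key in checkpoint and isinstance(checkpoint[key], dict):
                return checkpoint[key]
    return checkpoint


def load_for_eval(model: nn.Module, path: str) -> nn.Module:
    """Load weights on CPU, report key mismatches, and switch to eval mode."""
    checkpoint = torch.load(path, map_location="cpu")
    result = model.load_state_dict(extract_state_dict(checkpoint), strict=False)
    print("missing keys:", result.missing_keys)
    print("unexpected keys:", result.unexpected_keys)
    return model.eval()


if __name__ == "__main__":
    # Round trip with a temporary file standing in for a downloaded checkpoint.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo_checkpoint.pth")
        torch.save({"state_dict": StandInClassifier().state_dict()}, path)
        model = load_for_eval(StandInClassifier(), path)
    with torch.no_grad():
        logits = model(torch.zeros(1, 16))
    print(logits.shape)
```

Passing `strict=False` and printing the mismatched keys makes it easy to see whether a checkpoint actually matches the model definition. For fine-tuning, call `model.train()` after loading instead of leaving the model in eval mode.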

Code Availability:

Additional code and resources will be released soon. Stay tuned!

About

Multimodal Classification Modal Decoupling
