This is a multimodal large language model playground, and much more than a benchmark.
MLLM-playground, short for Multimodal Large Language Model Playground, is a toolkit designed to streamline the training and evaluation processes for various vision-and-language datasets using different multimodal large language models. This project offers a unified and user-friendly interface to facilitate the experimentation and development of multimodal models.
Follow these steps to set up MLLM-playground on your local machine.
- Python >= 3.10
- Clone the repository and navigate to the project directory:

  ```bash
  git clone https://github.com/chu0802/MLLM-playground.git
  cd MLLM-playground
  ```

- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```
To train and evaluate models using MLLM-playground, two main scripts are provided:
- `train.py`
- `eval.py`
Before running these scripts, it's essential to set up the configuration files appropriately.
- Training Configuration: Open `train_config.yaml` and adjust the parameters according to your experimental setup. Specify the dataset, model architecture, hyperparameters, and any other relevant settings. A rough sketch of such a file follows this list.
- Evaluation Configuration: Similarly, in `eval_config.yaml`, configure the parameters needed for evaluation, such as the path to the trained model, evaluation metrics, and so on.
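As a rough illustration, a `train_config.yaml` might be organized like the sketch below. Only `dataset.name` and `dataset.split.train.batch_size` are actually referenced in this README; every other key name here is an assumption made for illustration, so consult the configuration file shipped with the repository for the real schema.

```yaml
# Hypothetical sketch of train_config.yaml; apart from dataset.name and
# dataset.split.train.batch_size, all key names are illustrative assumptions.
model:
  arch: minigpt4            # which multimodal architecture to train (assumed key)
  checkpoint: null          # optional checkpoint to resume from (assumed key)

dataset:
  name: ScienceQA           # dataset to train on
  split:
    train:
      batch_size: 16        # training batch size

run:
  lr: 1e-5                  # learning rate (assumed key)
  num_epochs: 5             # number of training epochs (assumed key)
  output_dir: ./outputs     # where per-epoch checkpoints are saved (assumed key)
```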
Run the training script using the following command:
```bash
python train.py --cfg-path train_config.yaml
```
We provide basic settings in `train_config.yaml`, but you can override specific settings by adding command-line options. For example:
```bash
python train.py --cfg-path train_config.yaml --options dataset.name=ScienceQA dataset.split.train.batch_size=16
```
This command overrides the dataset name and training batch size specified in the configuration file; each dotted path after `--options` addresses a nested key in the YAML file.
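Assuming the nested layout sketched earlier, the same override could instead be baked into `train_config.yaml` directly:

```yaml
dataset:
  name: ScienceQA           # dataset.name on the command line
  split:
    train:
      batch_size: 16        # dataset.split.train.batch_size on the command line
```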
After training, you can evaluate the model by running the evaluation script:
```bash
python eval.py --cfg-path eval_config.yaml
```
Similarly, ensure that the evaluation configuration in `eval_config.yaml` is set up appropriately for your experiment; you can also override its settings by passing `--options` arguments.
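The README only states that `eval_config.yaml` holds things like the path to the trained model and the evaluation metrics; the sketch below is an assumed illustration of how such a file might look, not the actual schema.

```yaml
# Hypothetical sketch of eval_config.yaml; all key names are assumptions.
model:
  checkpoint: ./outputs/checkpoint_4.pth   # path to the trained model to evaluate
dataset:
  name: ScienceQA                          # dataset to evaluate on
  split:
    test:
      batch_size: 32                       # evaluation batch size (assumed key)
evaluation:
  metrics: [accuracy]                      # evaluation metrics (assumed key)
```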
During training, you can monitor progress on the WandB dashboard. The trained model is saved after every epoch, according to the settings in the configuration file.
By following these steps, you can efficiently train and evaluate multimodal large language models on various datasets using MLLM-playground.
This codebase is partially based on LVLM-eHub [paper, code] and LAVIS [paper, code].