GPT-Driven Neural Network Generator

short alias: lmurg

Overview 📖

NNGPT is a Python project that uses large language models (LLMs) to automate the creation of neural network architectures, streamlining the design process for machine learning practitioners. It fine-tunes LLMs on neural networks from the LEMUR Dataset and suggests candidate architectures while new neural network models are being created.

Create and Activate a Virtual Environment (recommended)

For Linux/Mac:

python3 -m venv .venv
source .venv/bin/activate

For Windows:

python3 -m venv .venv
.venv\Scripts\activate

The installation commands in this README assume that CUDA 12.6 is installed. If you have a different version, replace 'cu126' in those commands with the tag for your version (for example, 'cu118' for CUDA 11.8).
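The 'cu126' tag is the CUDA major and minor version with the dot removed. As a minimal sketch, the helper below derives the tag and the matching index URL from a version string. The function names are illustrative and not part of NNGPT, and the code does not check whether PyTorch actually publishes wheels for the resulting tag.

```python
def cuda_wheel_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '12.6' into a wheel tag such as 'cu126'."""
    major, minor = cuda_version.strip().split(".")[:2]
    return f"cu{int(major)}{int(minor)}"


def extra_index_url(cuda_version: str) -> str:
    """Build the --extra-index-url value used by the pip commands in this README."""
    return f"https://download.pytorch.org/whl/{cuda_wheel_tag(cuda_version)}"


if __name__ == "__main__":
    print(extra_index_url("12.6"))  # https://download.pytorch.org/whl/cu126
```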

Environment for NNGPT Developers

Pip package manager

Prerequisites for mpi4py package:

On Debian/Ubuntu systems, run:

   sudo apt install libmpich-dev    # for MPICH

   sudo apt install libopenmpi-dev  # for Open MPI

On Fedora/RHEL systems, run:

   sudo dnf install mpich-devel     # for MPICH

   sudo dnf install openmpi-devel   # for Open MPI

Create a virtual environment, activate it, and run the following commands to install all the project dependencies:

python -m pip install --upgrade pip
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu126

If there are installation problems, install the dependencies from the 'requirements.txt' file one by one.
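One possible way to install the dependencies one by one is sketched below. It reads 'requirements.txt', installs each entry with a separate pip call, and collects the entries that failed so the problematic package is easy to spot. The helper names are illustrative and not part of the project. Lines starting with '-' (pip options) are skipped on purpose, and the 'cu126' index URL is the one used elsewhere in this README.

```python
import subprocess
import sys
from pathlib import Path


def read_requirements(text: str) -> list[str]:
    """Return requirement lines, skipping blanks, comments, and pip option lines."""
    requirements = []
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing inline comments
        if line and not line.startswith(("#", "-")):
            requirements.append(line)
    return requirements


def install_one_by_one(path: str = "requirements.txt",
                       index_url: str = "https://download.pytorch.org/whl/cu126") -> list[str]:
    """Install each requirement with its own pip call; return the ones that failed."""
    failed = []
    for requirement in read_requirements(Path(path).read_text()):
        cmd = [sys.executable, "-m", "pip", "install", requirement,
               "--extra-index-url", index_url]
        if subprocess.run(cmd).returncode != 0:
            failed.append(requirement)
    return failed
```

Calling `install_one_by_one()` from the project root inside the activated virtual environment returns the list of requirements that could not be installed.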

Update of NN Dataset

Remove the old version and install the LEMUR Dataset from GitHub to get the most recent code and statistics updates:

rm -rf db
pip uninstall nn-dataset -y
pip install git+https://github.com/ABrain-One/nn-dataset --upgrade --force --extra-index-url https://download.pytorch.org/whl/cu126

Installing the stable version:

pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cu126

Adding functionality to export data to Excel files and generate plots for analyzing neural network performance:

pip install nn-stat --upgrade --extra-index-url https://download.pytorch.org/whl/cu126

Then export the data and generate the plots:

python -m ab.stat.export

Docker

All versions of this project are compatible with AI Linux and can be seamlessly run within its Docker container.

Installing the latest version of the project from GitHub

docker run --rm -u $(id -u):ab -v $(pwd):/a/mm abrainone/ai-linux bash -c "[ -d nn-gpt ] && git -C nn-gpt pull || git -c advice.detachedHead=false clone --depth 1 https://github.com/ABrain-One/nn-gpt"

Running script

docker run --rm -u $(id -u):ab --shm-size=16G -v $(pwd)/nn-gpt:/a/mm abrainone/ai-linux bash -c "python -m ab.gpt.TuneNNGen_8B"

Recently added dependencies might be missing from AI Linux. In this case, create a container from the Docker image abrainone/ai-linux, install the missing packages (preferably using pip install <package name>), and then create a new image from the container using docker commit <container name> <new image name>. You can use this new image locally or push it to a registry for deployment on a computer cluster.
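The sequence described above (create a container, install packages, commit a new image) can be sketched as a small Python helper that only assembles the Docker command lines. The container name, the new image name, and the package name are placeholders chosen for this sketch, and whether pip can write to the image's environment under its default user is an assumption that may not hold for every AI Linux version.

```python
import shlex

BASE_IMAGE = "abrainone/ai-linux"  # image name taken from this README


def extend_image_commands(packages, container="nngpt-deps", new_image="ai-linux-local"):
    """Return docker commands that install packages and commit a new image.

    `container` and `new_image` are placeholder names for this sketch.
    """
    pip_install = "pip install " + " ".join(shlex.quote(p) for p in packages)
    return [
        # Run pip inside a named container so its file system can be committed.
        ["docker", "run", "--name", container, BASE_IMAGE, "bash", "-c", pip_install],
        # Snapshot the stopped container as a new local image.
        ["docker", "commit", container, new_image],
        # Remove the container once the image exists.
        ["docker", "rm", container],
    ]


if __name__ == "__main__":
    for cmd in extend_image_commands(["example-package"]):
        print(shlex.join(cmd))
```

The printed commands can be reviewed and then run in a shell, or passed to `subprocess.run(cmd, check=True)` one at a time.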

Usage

Use NNAlter*.py to generate the initial modified CV models. The -e argument sets the number of epochs for this initial CV model generation.

Use TuneNNGen*.py to generate and evaluate CV models and to fine-tune and evaluate an LLM. The -s flag skips CV model generation for the specified number of epochs.
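As a rough illustration of the -s flag, the sketch below builds a command line for the ab.gpt.TuneNNGen_8B module mentioned in the Docker section and shows one reading of "skip for a specified number of epochs". It assumes that the skipped epochs are the first ones, which this README does not state explicitly. Both helper functions are hypothetical and not part of the project.

```python
import sys


def tune_command(module="ab.gpt.TuneNNGen_8B", skip_epochs=0):
    """Build a command line that passes -s to a TuneNNGen module."""
    cmd = [sys.executable, "-m", module]
    if skip_epochs > 0:
        cmd += ["-s", str(skip_epochs)]
    return cmd


def generation_schedule(total_epochs, skip_epochs):
    """Per epoch, report whether CV model generation would run.

    Assumption of this sketch: the skipped epochs are the first ones.
    """
    return [epoch >= skip_epochs for epoch in range(total_epochs)]


if __name__ == "__main__":
    print(tune_command(skip_epochs=2)[1:])
    print(generation_schedule(total_epochs=4, skip_epochs=2))  # [False, False, True, True]
```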

Pretrained LLM weights

Citation

The original version of this project was created at the Computer Vision Laboratory of the University of Würzburg by the authors mentioned below. If you find this project useful for your research, please consider citing our articles on the NNGPT framework and on hyperparameter tuning:

@article{ABrain.NNGPT,
  title        = {NNGPT: Rethinking AutoML with Large Language Models},
  author       = {Kochnev, Roman and Khalid, Waleed and Uzun, Tolgay Atinc and Zhang, Xi and Dhameliya, Yashkumar Sanjaybhai and Qin, Furui and Ignatov, Dmitry and Timofte, Radu},
  year         = {2025}
}

@article{ABrain.HPGPT,
  title        = {Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning?},
  author       = {Kochnev, Roman and Goodarzi, Arash Torabi and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu},
  journal      = {arXiv preprint arXiv:2504.06006},
  year         = {2025}
}

Licenses

This project is distributed under the following licensing terms:

models with pretrained weights are under the legacy DeepSeek LLM V2 license
all neural network models and their weights not covered by the above license, as well as all other files and assets in this project, are subject to the MIT license

The idea and leadership of Dr. Ignatov