Latency and Memory Analysis of Transformer Models for Training and Inference
Many formulas for estimating the training or inference latency and memory of Large Language Models (LLMs) and Transformers are floating around in papers, blogs, and elsewhere. Rather than redoing the math by hand or in spreadsheets, let's automate the boring stuff with llm-analysis ⚙️!
Given the specified model, GPU, data type, and parallelism configurations, llm-analysis estimates the latency and memory usage of LLMs for training or inference. With llm-analysis, one can easily try out different training/inference setups on paper and better understand system performance across scenarios.
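To give a flavor of the estimates involved, below is a minimal back-of-envelope sketch in plain Python (not the llm-analysis API) of per-GPU weight memory and a compute-bound decoding latency lower bound; the 13B-parameter model, fp16 weights, 2-way tensor parallelism, ~312 TFLOPS peak, and 50% utilization figures are illustrative assumptions, and llm-analysis itself accounts for much more (activations, KV cache, parallelism schemes, etc.).

```python
# Back-of-envelope estimates, NOT the llm-analysis API.

def weight_memory_per_gpu_gb(num_params: float, bytes_per_param: int, tp_size: int) -> float:
    """Weight memory per GPU; weights are sharded across tensor-parallel ranks."""
    return num_params * bytes_per_param / tp_size / 1e9

def decode_latency_lower_bound_ms(num_params: float, peak_flops: float, utilization: float) -> float:
    """Each decoded token needs roughly 2 * num_params FLOPs (forward pass only)."""
    return 2 * num_params / (peak_flops * utilization) * 1e3

if __name__ == "__main__":
    params = 13e9  # illustrative 13B-parameter model
    # fp16 weights (2 bytes/param) with 2-way tensor parallelism
    print(f"weights per GPU: {weight_memory_per_gpu_gb(params, 2, 2):.1f} GB")
    # A100-class peak of ~312 TFLOPS at 50% utilization (assumptions)
    print(f"per-token latency >= {decode_latency_lower_bound_ms(params, 312e12, 0.5):.2f} ms")
```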
llm-analysis helps answer questions such as:
- what batch size, data type, and parallelism scheme to use to get a feasible (not getting OOM) and optimal (maximizing throughput within a latency constraint) setup for training or inference
- how much time the given setup takes for training or inference, and the cost in GPU-hours (see the sketch after this list)
- how the latency/memory changes with a different model, GPU type, number of GPUs, data type for weights and activations, or parallelism configuration (suggesting the performance benefit of modeling changes, hardware improvements, quantization, parallelism, etc.)
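For the time and cost question above, a common rough estimate (again a sketch under stated assumptions, not llm-analysis output) applies the well-known ~6 × parameters × tokens rule of thumb for training FLOPs; multiplying the resulting GPU-hours by a price per GPU-hour then gives a dollar cost.

```python
# GPU-hours estimate from the ~6 * params * tokens training-FLOPs rule of thumb;
# the 7B-parameter / 1T-token / 312 TFLOPS / 45% MFU numbers are illustrative assumptions.

def training_gpu_hours(num_params: float, num_tokens: float,
                       peak_flops_per_gpu: float, mfu: float) -> float:
    total_flops = 6 * num_params * num_tokens               # forward + backward FLOPs
    gpu_seconds = total_flops / (peak_flops_per_gpu * mfu)  # time if run on a single GPU
    return gpu_seconds / 3600

hours = training_gpu_hours(7e9, 1e12, 312e12, 0.45)
print(f"~{hours:,.0f} GPU-hours")  # wall-clock time ~= GPU-hours / number of GPUs (ignoring communication overhead)
```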
To install this development build:

```sh
pip install --upgrade git+https://github.com/strategicalit/llm-analysis.git@main
```
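After installing, a quick sanity check is shown below; note that the module name `llm_analysis` and the distribution name `llm-analysis` are assumptions about how the package registers itself.

```python
# Verify the install; the module name "llm_analysis" and the distribution name
# "llm-analysis" are assumptions about how the package registers itself.
from importlib.metadata import version
import llm_analysis  # import succeeds if the package is on the path

print(version("llm-analysis"))
```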