Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP".
Updated Jun 27, 2025 - Python
A port of muP to JAX/Haiku
Maximal Update Parameterization in TensorFlow