Stars
Synthetic Data Generation with Execution-Based Verification and Grounding for LLM Training.
LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management
Scalable RL solution for advanced reasoning of language models
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
Recipes to scale inference-time compute of open models
AIDE: AI-Driven Exploration in the Space of Code. A state-of-the-art machine learning engineering agent that automates AI R&D.
A minimal GPU design in Verilog to learn how GPUs work from the ground up
🙌 OpenHands: Code Less, Make More
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
A simple, performant and scalable Jax LLM!
A Python toolbox for performing gradient-free optimization
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Instruction Tuning with GPT-4
Simple UI for LLM Model Finetuning
Port of OpenAI's Whisper model in C/C++
Unsupervised text tokenizer focused on computational efficiency
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Code for the paper "Evaluating Large Language Models Trained on Code"
Stable Diffusion web UI