Stars
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
PyTorch code and models for VJEPA2 self-supervised learning from video.
The official implementation of "Horizon Reduction Makes RL Scalable"
Time Blindness: Why Video-Language Models Can't See What Humans Can?
DSPy: The framework for programming—not prompting—language models
Tool for generating high-quality synthetic datasets
[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning
Let's make video diffusion practical!
Medical SAM 2: Segment Medical Images As Video Via Segment Anything Model 2
Minimal and annotated implementations of key ideas from modern deep learning research.
LightlyTrain is the first PyTorch framework to pretrain computer vision models on unlabeled data for industrial applications
PyTorch code and models for V-JEPA self-supervised learning from video.
CUDA Python: Performance meets Productivity
🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
Train transformer language models with reinforcement learning.
Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
Fully open reproduction of DeepSeek-R1
Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries".
allRank is a framework for training learning-to-rank neural models based on PyTorch.
A generative world for general-purpose robotics & embodied AI learning.
Large Concept Models: Language modeling in a sentence representation space
Code for NeurIPS 2024 paper - The GAN is dead; long live the GAN! A Modern Baseline GAN - by Huang et al.