Starred repositories
[CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution'
[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
Code release for "Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models" (ICLR 2025)
Enjoy the magic of Diffusion models!
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
[ICLR 2025 Oral] The official implementation of "Diffusion-Based Planning for Autonomous Driving with Flexible Guidance"
Open-Sora: Democratizing Efficient Video Production for All
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/MCP/Docker/Zotero
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Wan: Open and Advanced Large-Scale Video Generative Models
SEED-Voken: A Series of Powerful Visual Tokenizers
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
A differentiable PDE solving framework for machine learning
PyTorch implementation of a collection of scalable Video Transformer Benchmarks.
LangGPT: Empowering everyone to become a prompt expert!🚀 Structured Prompt, Language of GPT, structured prompts, Created by 「云中江树」
The official codebase of ECCV2024 paper: PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines.
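Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates 通义千问 (Tongyi Qianwen), deepseekcoder, 讯飞星火 (iFlytek Spark), 文心一言 (ERNIE Bot), llama2, rwkv, claude2, m…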
This is an improved generative adversarial network based on evolutionary network.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling