Stars
Paper List of Inference/Test Time Scaling/Computing
LBM: Latent Bridge Matching for Fast Image-to-Image Translation ✨
This is an automatic full segmentation tool based on Segment-Anything-2 and Segment-Anything-1. Our tool performs automatic full segmentation of the video, enabling the tracking of each object and …
verl: Volcano Engine Reinforcement Learning for LLMs
SGLang is a fast serving framework for large language models and vision language models.
🚀🚀 "Large Models": Train a 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Janus-Series: Unified Multimodal Understanding and Generation Models
Agent benchmark for medical diagnosis
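⛽️ "Algorithm Clearance Handbook" (算法通关手册): a highly detailed tutorial on the fundamentals of algorithms and data structures for complete beginners, with detailed solutions to 850+ LeetCode problems and 200 popular big-tech interview questions.
"Code Thoughts" (代码随想录) LeetCode practice guide: a recommended solving order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the hard parts, and 50+ mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer bewildering! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀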
Codebase and tutorial of ContPhy dataset generation for ICML 2024 paper "ContPhy: Continuum Physical Concept Learning and Reasoning from Videos"
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
[ICLR 2024] Thin-shell Object Manipulations with Differentiable Physics Simulations
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Reaching LLaMA2 Performance with 0.1M Dollars
Triton-based implementation of Sparse Mixture of Experts.
[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
shunzh / mcts-for-llm
Forked from SuReLI/dyna-gym
This is a pip package implementing Reinforcement Learning algorithms in non-stationary environments supported by the OpenAI Gym toolkit.
(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life
A generative world for general-purpose robotics & embodied AI learning.