Stars
PyTorch code and models for VJEPA2 self-supervised learning from video.
[CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
moojink / openvla-oft
Forked from openvla/openvla
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
[WACV 2025 Oral] Transferring Foundation Models for Generalizable Robotic Manipulation
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
A comprehensive list of papers about Robot Manipulation, including papers, codes, and related websites.
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, codes, and related websites
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Awesome list of papers that extend Mamba to various applications.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
The most reliable AI agent framework that supports MCP.
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
llama3 implementation one matrix multiplication at a time
MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving 5+ speedup on typical end-side chips
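Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and commercially usable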
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
✨✨Latest Advances on Multimodal Large Language Models
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.