Starred repositories
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
agilexrobotics / ugv_sdk
Forked from westonrobot/ugv_sdk
C++ SDK for Mobile Robot Platforms
Code for the ICLR 2024 spotlight paper: "Learning to Act without Actions" (introducing Latent Action Policies)
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
[IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
AffordPose: A Large-scale Dataset of Hand-Object Interactions with Affordance-driven Hand Pose (ICCV 2023)
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
A generative and self-guided robotic agent that endlessly proposes and masters new skills.
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
moojink / openvla-oft
Forked from openvla/openvla
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
[CoRL 2024] HumanPlus: Humanoid Shadowing and Imitation from Humans
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
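Tutorial for installing GMS on Huawei devices so that native GMS can be used. It refines and improves on earlier tutorials and uses microG. The original tutorial is on the author's personal blog: https://toalan.com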
The Next Step Forward in Multimodal LLM Alignment
The official repo for [ACM CSUR'24] "Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities"
This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov
[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)
Open-source implementation of AlphaFold3