A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.

972 40 Updated Jun 27, 2025

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 15,974 1,879 Updated Dec 25, 2024

wyf3 / llm_related

Reproductions of large-model (LLM) algorithms, plus some study notes

Jupyter Notebook 1,776 259 Updated Jun 14, 2025

openai / CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Jupyter Notebook 29,604 3,650 Updated Jul 23, 2024

GengzeZhou / NavGPT

[AAAI 2024] Official implementation of NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

Python 247 25 Updated Nov 7, 2023

YicongHong / Recurrent-VLN-BERT

Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation

Python 182 34 Updated Aug 13, 2022

leggedrobotics / viplanner

ViPlanner: Visual Semantic Imperative Learning for Local Navigation

Python 499 47 Updated Feb 12, 2025

cnyvfang / labelGo-Yolov5AutoLabelImg

YOLOv5 semi-automatic annotation tool (based on labelImg)

Python 476 68 Updated Jun 22, 2023

kaylorchen / rk3588-yolo-demo

The project is a multi-threaded inference demo of YOLO running on the RK3588 platform, adapted for reading video files and camera feeds. The demo uses the YOLOv8n model for file infe…

C++ 337 50 Updated May 25, 2025

leggedrobotics / darknet_ros

YOLO ROS: Real-Time Object Detection for ROS

C++ 2,343 1,207 Updated Jul 19, 2024

WZMIAOMIAO / deep-learning-for-image-processing

Deep learning for image processing, including classification, object detection, etc.

Python 25,054 8,191 Updated Jan 12, 2025

rasbt / LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 57,515 7,992 Updated Jun 28, 2025

ollama / ollama

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.

Go 145,101 12,248 Updated Jun 29, 2025

huggingface / blog

Public repo for HF blog posts

Jupyter Notebook 2,998 875 Updated Jun 27, 2025

PhoenixZqh

Lists (18)

3588

CPP PJ

PointCloud

REID

UAV

VLA

VLN

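Sensor Fusion

Single-Object Tracking

Counter-UAV

Multi-Object Tracking

Datang

Large Models

Small-Object Detection

Control

UAV Datasets

Deep Learning Tutorials

Cross-View Object Tracking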

Stars