Stars
[ArXiv 2025] Pseudo-Simulation for Autonomous Driving; [NeurIPS 2024] NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking
Differentiable IoU of rotated bounding boxes using PyTorch
Emu Series: Generative Multimodal Models from BAAI
[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Hackable and optimized Transformers building blocks, supporting a composable construction.
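Reinforcement learning tutorial in Chinese (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/
Hung-yi Lee's Deep Learning Tutorial (recommended by Professor Hung-yi Lee 👍; the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases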
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
🦜🔗 Build context-aware reasoning applications
Fast and memory-efficient exact attention
A curated list of world models for autonomous driving, kept up to date.
Mastering Diverse Domains through World Models
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
Making large AI models cheaper, faster and more accessible
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent