Stars
[CVPR 2024] Code for the paper 'Towards Learning a Generalist Model for Embodied Navigation'
MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving
Open-source repo for the Locate 3D Model, 3D-JEPA, and the Locate 3D Dataset
[CVPR 2025] RoomTour3D: geometry-aware, cheap, and automatic data from web videos for embodied navigation
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
A curated list for vision-and-language navigation, accompanying the ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
An alternative to labeling your mocap markers by hand.
[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"
Official implementation of "g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks" (CVPR'25).
A zero-shot framework, based on a large vision-language model, for exploring and searching for language-described targets in unknown environments.
[CVPR 2025] UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
[NeurIPS 2024] SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
Code for the CVPR 2021 oral paper "A Recurrent Vision-and-Language BERT for Navigation"
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
[Actively Maintained🔥] A list of Embodied AI papers accepted at top conferences (ICLR, NeurIPS, ICML, RSS, CoRL, ICRA, IROS, CVPR, ICCV, ECCV).
Habitat-based tools for dynamic arrangement and data recording
Code associated with the paper "VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation" (ICRA 2024)