- NTU, Singapore
- https://aroncao49.github.io/
Stars
[ECCV 2024] Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
Semantic DSP Map in paper "Particle-based Instance-aware Semantic Occupancy Mapping in Dynamic Environments"
Official implementation of paper "Pyramid Diffusion for Fine 3D Large Scene Generation" (ECCV 2024 Oral)
[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer
Learn how to train a quadruped robot to walk using reinforcement learning, from defining actions and observations to designing rewards and transitioning from simulation to reality.
LiMo-Calib: On-Site Fast LiDAR-Motor Calibration for Quadruped Robot-Based Panoramic 3D Sensing System
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
UniDSeg: Unified Cross-Domain 3D Semantic Segmentation via Visual Foundation Models Prior (NeurIPS 2024)
[CVPR 2024 Highlight] XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies
Official Implementation of the ICCV 2023 paper: Perpetual Humanoid Control for Real-time Simulated Avatars
[CVPR24] COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction
[ICLR'25] [3D-LLM] City-scale 3D Visual Grounding with Multi-modality LLMs
[ICLR 2025 Spotlight] Official implementation for "DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes"
KISS-Matcher: Fast, Robust, and Scalable Registration + ROS2 SLAM examples
Mobile manipulation research tools for roboticists
Isaac Gym Environments for Legged Robots
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A suite of image and video neural tokenizers
TensorFlow Implementation for Computing a Semantically Segmented Bird's Eye View (BEV) Image Given the Images of Multiple Vehicle-Mounted Cameras.
A flexible, high-performance 3D simulator for Embodied AI research.
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems