jeasinema

💭

I may be slow to respond.

Xiaojian Ma jeasinema

251 followers · 107 following

Lists (1)

Leisure

1 repository

Stars

zjysteven / lmms-finetune

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Python 311 37 Updated Feb 25, 2025

changhaonan / A3VLM

[CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`

Python 114 2 Updated Oct 7, 2024

SiyuanHuang95 / ManipVQA

[IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models

Python 97 3 Updated Aug 22, 2024

Embodied-VideoAgent / embodied-videoagent

3 Updated Jun 26, 2025

ControlGenAI / Inverse-and-Edit

Jupyter Notebook 25 1 Updated Jun 25, 2025

FreedomIntelligence / ShareGPT-4o-Image

231 10 Updated Jun 28, 2025

google-deepmind / aloha_sim

A collection of tabletop tasks in Mujoco

Python 198 16 Updated Jun 30, 2025

UWRobotLearning / WheeledLab

Environments, assets, workflow for open-source mobile robotics, integrated with IsaacLab.

Python 147 21 Updated May 22, 2025

cake-lab / ARFlow

Democratizing Augmented Reality research and development.

C# 45 9 Updated Jul 11, 2025

beacon-3d / beacon-3d

[CVPR 2025] Official code repository for Beacon3D benchmark

Python 14 Updated May 15, 2025

Computer-use-agents / MacOS-Agent

A powerful automation agent for macOS that enables natural language control of various system applications and services. This agent allows you to interact with your Mac using simple text commands, …

Python 18 Updated Jun 5, 2025

microsoft / mineworld

MineWorld: A Real-time interactive world model on Minecraft

Python 363 27 Updated Jul 8, 2025

OpenRobotLab / Re3Sim

Official implementation of "Re3Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation"

Jupyter Notebook 102 4 Updated Mar 18, 2025

bdaiinstitute / embodied_gaussians

Python 154 10 Updated Apr 25, 2025

RoboVerseOrg / RoboVerse

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

Python 1,318 96 Updated Jul 14, 2025

Inception3D / Easi3R

[ICCV 2025] A simple training-free approach adapting DUSt3R for dynamic scenes.

Python 425 17 Updated Apr 1, 2025

OmniMMI / M4

[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Python 10 Updated Apr 2, 2025

ahujasid / blender-mcp

Python 12,384 1,144 Updated Jun 21, 2025

facebookresearch / co-tracker

CoTracker is a model for tracking any point (pixel) on a video.

Jupyter Notebook 4,450 308 Updated Jan 21, 2025

roboflow / rf-detr

RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO & designed for fine-tuning.

Python 2,339 260 Updated Jul 10, 2025

CraftJarvis / JarvisVLA

Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse"

Python 85 7 Updated May 28, 2025

rongyaofang / GoT

Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"

Jupyter Notebook 266 11 Updated Apr 30, 2025

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 9,757 938 Updated Jul 14, 2025

silence143 / EMMOE

EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments

Python 18 Updated May 15, 2025

embodiedreasoning / ERQA

Embodied Reasoning Question Answer (ERQA) Benchmark

Python 183 8 Updated Mar 12, 2025

openai / openai-cua-sample-app

Learn how to use CUA (our Computer Using Agent) via the API on multiple computer environments.

Python 1,004 295 Updated Apr 24, 2025

FoundationAgents / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 47,946 8,383 Updated Jul 14, 2025

CraftJarvis / ROCKET-2

Official Implementation of Paper "ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment"

Python 39 Updated Jul 2, 2025

graspnet / AnyDexGrasp

Python 140 8 Updated Mar 22, 2025

Multiverse-Framework / Multiverse

Decentralized Simulation Framework designed to integrate multiple advanced physics engines along with various photo-realistic graphics engines to simulate everything

32 5 Updated Jul 2, 2025