ACkuku (Hal) / Starred · GitHub


Simulated experiments for "Real-Time Execution of Action Chunking Flow Policies".

Python 83 1 Updated Jun 8, 2025

PDF textbooks for all levels: primary school, middle school, high school, and university.

Roff 38,925 8,653 Updated May 18, 2025

Online RL with Simple Reward Enables Training VLA Models with Only One Trajectory

Python 210 5 Updated May 30, 2025
Python 16 Updated May 16, 2025

Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Python 213 6 Updated Jun 11, 2025

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

55,465 5,922 Updated Jun 4, 2025

MMICL, a state-of-the-art VLM with in-context learning (ICL) ability, from PKU

Python 351 15 Updated Dec 18, 2023

The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"

Python 159 16 Updated Mar 17, 2025

SAM with text prompt

Python 2,232 258 Updated May 10, 2025

🚀 One-stop solution for creating your digital avatar from chat history 💡 Fine-tune LLMs with your chat logs to capture your unique style, then bind to a chatbot to bring your digital self to life. …

Python 13,855 1,031 Updated Jun 14, 2025

Codes of paper "GraspSAM: When Segment Anything Model meets Grasp Detection", ICRA 2025

Python 23 1 Updated Feb 17, 2025

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

888 40 Updated Apr 20, 2025
Python 16 1 Updated May 26, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 3,253 273 Updated Jun 15, 2025

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Python 2,159 311 Updated Mar 3, 2025

Autoregressive Policy for Robot Learning (RA-L 2025)

Python 120 9 Updated Mar 25, 2025

real time face swap and one-click video deepfake with only a single image

Python 70,982 10,112 Updated Jun 15, 2025

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Python 237 8 Updated Jun 15, 2025

[CoRL 2024] HumanPlus: Humanoid Shadowing and Imitation from Humans

Python 738 113 Updated Jul 1, 2024

[CVPR 25 Highlight & ECCV 24 Workshop Best Paper] RoboTwin Dual-arm Robot Manipulation Simulation Platform

Python 1,020 108 Updated Jun 15, 2025

Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success

Python 460 35 Updated Apr 28, 2025
Python 8 Updated Jun 6, 2025
Python 377 10 Updated Apr 15, 2025

Official code for "Behavior Generation with Latent Actions" (ICML 2024 Spotlight)

Python 172 13 Updated Feb 28, 2024

A programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 89,288 10,224 Updated Jun 15, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 52,298 6,327 Updated Jun 12, 2025

[TRO 2025] NeuPAN: Direct Point Robot Navigation with End-to-End Model-based Learning.

Python 523 47 Updated Jun 12, 2025

NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills.

Jupyter Notebook 4,137 554 Updated Jun 11, 2025

Code for the paper: "Active Vision Might Be All You Need: Exploring Active Vision in Bimanual Robotic Manipulation"

Python 36 3 Updated Mar 22, 2025

[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer

Python 7,878 795 Updated Jun 11, 2025