The universal framework for AI-native generalist robotics
Dimensional is an open-source framework for building agentive generalist robots. DimOS allows off-the-shelf Agents to call tools/functions and read sensor/state data directly from ROS.
The framework enables neurosymbolic orchestration of Agents as generalized spatial reasoners/planners and Robot state/action primitives as functions.
The result: cross-embodied "Dimensional Applications" that generalize exceptionally well and execute symbolic actions robustly.
We are shipping a first look at the DIMOS x Unitree Go2 integration, allowing for off-the-shelf Agents() to "call" Unitree ROS2 Nodes and WebRTC action primitives, including:
- Navigation control primitives (move, reverse, spinLeft, spinRight, etc.)
- WebRTC control primitives (FrontPounce, FrontFlip, FrontJump, etc.)
- Camera feeds (image_raw, compressed_image, etc.)
- IMU data
- State information
- Lidar / PointCloud primitives
- Any other generic Unitree ROS2 topics
DimOS Agents
- Agent() classes with planning, spatial reasoning, and Robot.Skill() function calling abilities.
- Integrate with any off-the-shelf hosted or local model: OpenAIAgent, ClaudeAgent, GeminiAgent 🚧, DeepSeekAgent 🚧, HuggingFaceRemoteAgent, HuggingFaceLocalAgent, etc.
- Modular agent architecture for easy extensibility and chaining of Agent output --> Subagent input.
- Agent spatial / language memory for location grounded reasoning and recall.
DimOS Infrastructure
- A reactive data streaming architecture using RxPY to manage real-time video (or other sensor input), outbound commands, and inbound robot state between the DimOS interface, Agents, and ROS2.
- Robot Command Queue to handle complex multi-step action sequences sent to the Robot.
- Simulation bindings (Genesis, Isaac Sim, etc.) to test your agentive application before deploying to a physical robot.
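The command-queue idea can be illustrated with a minimal, self-contained sketch. This is illustrative only (the `CommandQueue` class below is hypothetical, not the DimOS ROSCommandQueue API): skills enqueue callables and a single worker executes them one at a time, so multi-step actions run in order.

```python
import queue
import threading

class CommandQueue:
    """Minimal FIFO command queue: callers enqueue callables, a single
    worker thread executes them one at a time, preserving order."""

    def __init__(self):
        self._queue = queue.Queue()
        self._results = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, command, *args):
        self._queue.put((command, args))

    def _run(self):
        while True:
            command, args = self._queue.get()
            self._results.append(command(*args))  # execute sequentially
            self._queue.task_done()

    def wait(self):
        self._queue.join()  # block until every queued command has finished
        return list(self._results)

# Queue a multi-step action sequence; steps run strictly in order.
cq = CommandQueue()
cq.enqueue(lambda d: f"move {d}m", 1.0)
cq.enqueue(lambda a: f"spin {a}deg", 90)
print(cq.wait())  # -> ['move 1.0m', 'spin 90deg']
```

A single worker thread is the key design choice: it serializes robot commands so a later step can never start before an earlier one completes.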
DimOS Interface / Development Tools
- Local development interface to control your robot, orchestrate agents, visualize camera/lidar streams, and debug your dimensional agentive application.
⚠️ Recommended way to start. Prerequisites:
- Docker and Docker Compose installed
- A Unitree Go2 robot accessible on your network
- The robot's IP address
- OpenAI API Key
Configure your environment variables in .env
OPENAI_API_KEY=<OPENAI_API_KEY>
ALIBABA_API_KEY=<ALIBABA_API_KEY>
ANTHROPIC_API_KEY=<ANTHROPIC_API_KEY>
ROBOT_IP=<ROBOT_IP>
CONN_TYPE=webrtc
WEBRTC_SERVER_HOST=0.0.0.0
WEBRTC_SERVER_PORT=9991
DISPLAY=:0
xhost +local:root # If running locally and you want the RViz GUI
docker compose -f docker/unitree/agents_interface/docker-compose.yml up --build
Interface will start at http://localhost:3000
- A Unitree Go2 robot accessible on your network
- The robot's IP address
- OpenAI/Claude/Alibaba API Key
sudo apt install python3-venv
# Clone the repository
git clone --recurse-submodules https://github.com/dimensionalOS/dimos-unitree.git
cd dimos-unitree
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate
sudo apt install portaudio19-dev python3-pyaudio
# Install torch and torchvision if not already installed
pip install -r base-requirements.txt
# Install dependencies
pip install -r requirements.txt
# Copy and configure environment variables
cp default.env .env
Full functionality requires API keys for the following:
- OpenAI API key (required for all LLMAgents due to OpenAIEmbeddings)
- Claude API key (required for ClaudeAgent)
- Alibaba API key (required for Navigation skills)
These keys can be added to your .env file or exported as environment variables.
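For reference, here is a minimal standard-library sketch of loading `KEY=VALUE` pairs from a `.env` file into the process environment (DimOS itself may load `.env` differently, e.g. via python-dotenv; the `load_env` helper and `DIMOS_DEMO_KEY` below are illustrative):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ,
    skipping comments and blanks; already-set variables take precedence."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Example: write a throwaway .env and load it
with open("demo.env", "w") as f:
    f.write("DIMOS_DEMO_KEY=abc123\n# comment line\n")
load_env("demo.env")
print(os.environ["DIMOS_DEMO_KEY"])  # -> abc123
```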
export OPENAI_API_KEY=<your private key>
export CLAUDE_API_KEY=<your private key>
export ALIBABA_API_KEY=<your private key>
- Ubuntu 22.04
- ROS2 Distros: Iron, Humble, Rolling
See Unitree Go2 ROS2 SDK for additional installation instructions.
mkdir -p ros2_ws
cd ros2_ws
git clone --recurse-submodules https://github.com/dimensionalOS/go2_ros2_sdk.git src
sudo apt install ros-$ROS_DISTRO-image-tools
sudo apt install ros-$ROS_DISTRO-vision-msgs
sudo apt install python3-pip clang portaudio19-dev
cd src
pip install -r requirements.txt
cd ..
# Ensure clean python install before running
source /opt/ros/$ROS_DISTRO/setup.bash
rosdep install --from-paths src --ignore-src -r -y
colcon build
# Change path to your Go2 ROS2 SDK installation
source /ros2_ws/install/setup.bash
source /opt/ros/$ROS_DISTRO/setup.bash
export ROBOT_IP="robot_ip" # for multiple robots, separate IPs with commas
export CONN_TYPE="webrtc"
ros2 launch go2_robot_sdk robot.launch.py
# Change path to your Go2 ROS2 SDK installation
source /ros2_ws/install/setup.bash
python tests/run.py
cd dimos/web/dimos_interface
yarn install
yarn dev # you may need sudo if the project was previously built via Docker
.
├── dimos/
│ ├── agents/ # Agent implementations
│ │ └── memory/ # Memory systems for agents, including semantic memory
│ ├── environment/ # Environment context and sensing
│ ├── hardware/ # Hardware abstraction and interfaces
│ ├── models/ # ML model definitions and implementations
│ │ ├── Detic/ # Detic object detection model
│ │ ├── depth/ # Depth estimation models
│ │ └── segmentation/ # Image segmentation models
│ ├── perception/ # Computer vision and sensing
│ │ ├── detection2d/ # 2D object detection
│ │ └── segmentation/ # Image segmentation pipelines
│ ├── robot/ # Robot control and hardware interface
│ │ ├── global_planner/ # Path planning at global scale
│ │ ├── local_planner/ # Local navigation planning
│ │ └── unitree/ # Unitree Go2 specific implementations
│ ├── simulation/ # Robot simulation environments
│ │ ├── genesis/ # Genesis simulation integration
│ │ └── isaac/ # NVIDIA Isaac Sim integration
│ ├── skills/ # Task-specific robot capabilities
│ │ └── rest/ # REST API based skills
│ ├── stream/ # WebRTC and data streaming
│ │ ├── audio/ # Audio streaming components
│ │ └── video_providers/ # Video streaming components
│ ├── types/ # Type definitions and interfaces
│ ├── utils/ # Utility functions and helpers
│ └── web/ # DimOS development interface and API
│ ├── dimos_interface/ # DimOS web interface
│ └── websocket_vis/ # Websocket visualizations
├── tests/ # Test files
│ ├── genesissim/ # Genesis simulator tests
│ └── isaacsim/ # Isaac Sim tests
└── docker/ # Docker configuration files
├── agent/ # Agent service containers
├── interface/ # Interface containers
├── simulation/ # Simulation environment containers
└── unitree/ # Unitree robot specific containers
import time

from dimos.robot.unitree.unitree_go2 import UnitreeGo2
from dimos.robot.unitree.unitree_skills import MyUnitreeSkills
from dimos.robot.unitree.unitree_ros_control import UnitreeROSControl
from dimos.agents.agent import OpenAIAgent

# Initialize robot
robot = UnitreeGo2(ip=robot_ip,
                   ros_control=UnitreeROSControl(),
                   skills=MyUnitreeSkills())

# Initialize agent
agent = OpenAIAgent(
    dev_name="UnitreeExecutionAgent",
    input_video_stream=robot.get_ros_video_stream(),
    skills=robot.get_skills(),
    system_query="Jump when you see a human! Front flip when you see a dog!",
    model_name="gpt-4o"
)

while True:  # keep process running
    time.sleep(1)
Let's build a simple DimOS application with Agent chaining. We define a planner as a PlanningAgent that takes user input and devises a complex multi-step plan. The plan is passed step-by-step to an executor agent that queues AbstractRobotSkill commands into the ROSCommandQueue.

Our reactive Pub/Sub data streaming architecture allows chaining of Agent_0 --> Agent_1 --> ... --> Agent_n via the input_query_stream parameter, which takes an Observable input from the previous Agent in the chain. Via this method you can chain together any number of Agents() to create complex dimensional applications.
import time

web_interface = RobotWebInterface(port=5555)

robot = UnitreeGo2(ip=robot_ip,
                   ros_control=UnitreeROSControl(),
                   skills=MyUnitreeSkills())

# Initialize master planning agent
planner = PlanningAgent(
    dev_name="UnitreePlanningAgent",
    input_query_stream=web_interface.query_stream,  # Takes user input from the DimOS interface
    skills=robot.get_skills(),
    model_name="gpt-4o",
)

# Initialize execution agent
executor = OpenAIAgent(
    dev_name="UnitreeExecutionAgent",
    input_query_stream=planner.get_response_observable(),  # Takes planner output as input
    skills=robot.get_skills(),
    model_name="gpt-4o",
    system_query="""
    You are a robot execution agent that can execute tasks on a virtual
    robot. ONLY OUTPUT THE SKILLS TO EXECUTE.
    """
)

while True:  # keep process running
    time.sleep(1)
Call action primitives directly from Robot() for prototyping and testing.

robot = UnitreeGo2(ip=robot_ip)

# Call a Unitree WebRTC action primitive
robot.webrtc_req(api_id=1016)  # "Hello" command

# Call a ROS2 action primitive
robot.move(distance=1.0, speed=0.5)
Create basic custom skills by inheriting from AbstractRobotSkill and implementing the __call__ method.
class Move(AbstractRobotSkill):
    distance: float = Field(..., description="Distance to move in meters")

    def __init__(self, robot: Optional[Robot] = None, **data):
        super().__init__(robot=robot, **data)

    def __call__(self):
        super().__call__()
        return self._robot.move(distance=self.distance)

class JumpAndFlip(AbstractRobotSkill):
    def __init__(self, robot: Optional[Robot] = None, **data):
        super().__init__(robot=robot, **data)

    def __call__(self):
        super().__call__()
        jump = Jump(robot=self._robot)
        flip = Flip(robot=self._robot)
        return (jump() and flip())
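To see the composition pattern in isolation, here is a self-contained sketch with the base classes stubbed out. Everything below is mocked for illustration: the real AbstractRobotSkill is a pydantic model, and Jump/Flip would be real DimOS skills backed by robot primitives.

```python
class AbstractRobotSkill:
    """Stub of the skill base class: stores the robot and logs invocation."""
    def __init__(self, robot=None, **data):
        self._robot = robot

    def __call__(self):
        print(f"Executing {type(self).__name__}")

class FakeRobot:
    """Stub robot whose action primitives simply report success."""
    def jump(self):
        return True
    def flip(self):
        return True

class Jump(AbstractRobotSkill):
    def __call__(self):
        super().__call__()
        return self._robot.jump()

class Flip(AbstractRobotSkill):
    def __call__(self):
        super().__call__()
        return self._robot.flip()

class JumpAndFlip(AbstractRobotSkill):
    """Composite skill: succeeds only if both sub-skills succeed."""
    def __call__(self):
        super().__call__()
        jump = Jump(robot=self._robot)
        flip = Flip(robot=self._robot)
        return jump() and flip()

result = JumpAndFlip(robot=FakeRobot())()
print(result)  # -> True
```

Because each skill is just a callable object holding a robot reference, composite skills like JumpAndFlip can build and invoke sub-skills at call time.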
DimOS agents, such as OpenAIAgent, can be endowed with capabilities through two primary mechanisms: by providing them with individual skill classes or with comprehensive SkillLibrary instances. This design offers flexibility in how robot functionalities are defined and managed within your agent-based applications.

Agent's skills Parameter
The skills parameter in an agent's constructor is key to this integration:
- A Single Skill Class: This approach is suitable for skills that are relatively self-contained or have straightforward initialization requirements.
  - You pass the skill class itself (e.g., GreeterSkill) directly to the agent's skills parameter.
  - The agent then takes on the responsibility of instantiating this skill when it's invoked. This typically involves the agent providing necessary context to the skill's constructor (__init__), such as a Robot instance (or any other private instance variable) if the skill requires it.
- A SkillLibrary Instance: This is the preferred method for managing a collection of skills, especially when skills have dependencies, require specific configurations, or need to share parameters.
  - You first define your custom skill library by inheriting from SkillLibrary. Then, you create and configure an instance of this library (e.g., my_lib = EntertainmentSkills(robot=robot_instance)).
  - This pre-configured SkillLibrary instance is then passed to the agent's skills parameter. The library itself manages the lifecycle and provision of its contained skills.
Examples:
First, define your skill. For instance, a GreeterSkill that can deliver a configurable greeting:
class GreeterSkill(AbstractSkill):
    """Greets the user with a friendly message."""  # Gives the agent better context for understanding (the more detailed the better).

    greeting: str = Field(..., description="The greeting message to display.")  # The field needed for calling the function. Your agent also pulls from this description for better context.

    def __init__(self, greeting_message: Optional[str] = None, **data):
        super().__init__(**data)
        if greeting_message:
            self.greeting = greeting_message
        # Any additional skill-specific initialization can go here

    def __call__(self):
        super().__call__()  # Call parent's method if it contains base logic
        # Implement the logic for the skill
        print(self.greeting)
        return f"Greeting delivered: '{self.greeting}'"
Next, register this skill class directly with your agent. The agent can then instantiate it, potentially with specific configurations if your agent or skill supports it (e.g., via default parameters or a more advanced setup).
agent = OpenAIAgent(
    dev_name="GreetingBot",
    system_query="You are a polite bot. If a user asks for a greeting, use your GreeterSkill.",
    skills=GreeterSkill,  # Pass the GreeterSkill CLASS
    # The agent will instantiate GreeterSkill.
    # If the skill had required __init__ args not provided by the agent automatically,
    # this direct class passing might be insufficient without further agent logic
    # or without passing a pre-configured instance (see SkillLibrary example).
    # For simple skills like GreeterSkill with defaults or optional args, this works well.
    model_name="gpt-4o"
)
In this setup, when the GreetingBot agent decides to use the GreeterSkill, it will instantiate it. If the GreeterSkill were to be instantiated by the agent with a specific greeting_message, the agent's design would need to support passing such parameters during skill instantiation.
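You can also exercise a skill outside an agent (e.g. in a unit test) by instantiating and calling it directly. Here is a minimal sketch with the pydantic machinery stubbed out; the real AbstractSkill and Field come from DimOS and pydantic, so this stub exists only to make the example self-contained:

```python
class AbstractSkill:
    """Stub standing in for DimOS's pydantic-based AbstractSkill."""
    def __init__(self, **data):
        for key, value in data.items():
            setattr(self, key, value)

    def __call__(self):
        pass  # base hook; real base class may add logging/validation

class GreeterSkill(AbstractSkill):
    greeting = "Hello!"  # default; normally declared via a pydantic Field

    def __init__(self, greeting_message=None, **data):
        super().__init__(**data)
        if greeting_message:
            self.greeting = greeting_message

    def __call__(self):
        super().__call__()
        return f"Greeting delivered: '{self.greeting}'"

# Instantiate with a specific message and invoke directly
skill = GreeterSkill(greeting_message="Welcome to DimOS!")
print(skill())  # -> Greeting delivered: 'Welcome to DimOS!'
```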
Define the SkillLibrary and any skills it will manage in its collection:
class MovementSkillsLibrary(SkillLibrary):
    """A specialized skill library containing movement and navigation related skills."""

    def __init__(self, robot=None):
        super().__init__()
        self._robot = robot

    def initialize_skills(self, robot=None):
        """Initialize all movement skills with the robot instance."""
        if robot:
            self._robot = robot
        if not self._robot:
            raise ValueError("Robot instance is required to initialize skills")

        # Initialize with all movement-related skills
        self.add(Navigate(robot=self._robot))
        self.add(NavigateToGoal(robot=self._robot))
        self.add(FollowHuman(robot=self._robot))
        self.add(NavigateToObject(robot=self._robot))
        self.add(GetPose(robot=self._robot))  # Position tracking skill
Note how each initialized skill is added to the library's collection above.
Finally, use this skill library in an Agent in your main application code:
# 1. Create an instance of your custom skill library, configured with the robot
my_movement_skills = MovementSkillsLibrary(robot=robot_instance)

# 2. Pass this library INSTANCE to the agent
performing_agent = OpenAIAgent(
    dev_name="ShowBot",
    system_query="You are a show robot. Use your skills as directed.",
    skills=my_movement_skills,  # Pass the configured SkillLibrary INSTANCE
    model_name="gpt-4o"
)
- tests/run_go2_ros.py: Tests UnitreeROSControl(ROSControl) initialization in UnitreeGo2(Robot) via direct function calls robot.move() and robot.webrtc_req()
- tests/simple_agent_test.py: Tests a simple zero-shot OpenAIAgent example
- tests/unitree/test_webrtc_queue.py: Tests ROSCommandQueue via 20 back-to-back WebRTC requests to the robot
- tests/test_planning_agent_web_interface.py: Tests a simple two-stage PlanningAgent chained to an ExecutionAgent with a backend FastAPI interface
- tests/test_unitree_agent_queries_fastapi.py: Tests a zero-shot ExecutionAgent with a backend FastAPI interface
For detailed documentation, please visit our documentation site (Coming Soon).
We welcome contributions! See our Bounty List for open requests for contributions. If you would like to suggest a feature or sponsor a bounty, open an issue.
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Huge thanks to:
- The Roboverse Community and their Unitree-specific help. Check out their Discord.
- @abizovnuralem for his work on the Unitree Go2 ROS2 SDK we integrate with for DimOS.
- @legion1581 for his work on Unitree Go2 WebRTC Connect, from which we've pulled the Go2WebRTCConnection class and other types for seamless WebRTC-only integration with DimOS.
- @tfoldi for the webrtc_req integration via the Unitree Go2 ROS2 SDK, which allows for seamless usage of Unitree WebRTC control primitives with DimOS.
- GitHub Issues: For bug reports and feature requests
- Email: build@dimensionalOS.com
- Agent() failure to execute Nav2 action primitives (move, reverse, spinLeft, spinRight) is almost always due to the internal ROS2 collision avoidance, which will sometimes incorrectly detect obstacles or be overly sensitive. Look for [behavior_server]: Collision Ahead - Exiting DriveOnHeading in the ROS logs. We recommend restarting ROS2 or moving the robot away from objects to resolve.
- docker-compose up --build does not fully initialize the ROS2 environment due to std::bad_alloc errors. This can occur during continuous Docker development if docker-compose down is not run consistently before rebuilding, and/or you are on a machine with less RAM, as ROS is very memory intensive. We recommend clearing your Docker cache/images/containers with docker system prune and rebuilding.