BlueROV2 Gymnasium Environment

A Gymnasium environment for simulating and training reinforcement learning agents on the BlueROV2 underwater vehicle. This environment provides a realistic simulation of the BlueROV2's dynamics and supports various control tasks.

🌊 Features

Realistic Physics: Implements validated hydrodynamic model of the BlueROV2
3D Visualization: Real-time 3D rendering using Meshcat
Custom Rewards: Configurable reward functions for different tasks
Disturbance Modeling: Includes environmental disturbances for realistic underwater conditions
Stable-Baselines3 Compatible: Ready to use with popular RL frameworks
Customizable Environment: Easy to modify for different underwater tasks
(Future release: spawn multiple AUVs)

🛠️ Installation

Prerequisites

Python ≥3.10
uv (recommended) or pip

Using uv (Recommended)

# Clone the repository
git clone https://github.com/gokulp01/bluerov2_gym.git
cd bluerov2_gym

# Create and activate a virtual environment
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install the package
uv pip install -e .

Using pip

# Clone the repository
git clone https://github.com/gokulp01/bluerov2_gym.git
cd bluerov2_gym

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install the package
pip install -e .

🎮 Usage

Basic Usage

import gymnasium as gym
import bluerov2_gym

# Create the environment
env = gym.make("BlueRov-v0", render_mode="human")

# Reset the environment
observation, info = env.reset()

# Run a simple control loop
while True:
    # Take a random action
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    
    if terminated or truncated:
        observation, info = env.reset()

Training with Stable-Baselines3 (refer to examples/train.py for full code example)

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# Create and wrap the environment
env = gym.make("BlueRov-v0")
env = DummyVecEnv([lambda: env])
env = VecNormalize(env)

# Initialize the agent
model = PPO("MultiInputPolicy", env, verbose=1)

# Train the agent
model.learn(total_timesteps=1_000_000)

# Save the trained model
model.save("bluerov_ppo")

🎯 Environment Details

State Space

The environment uses a Dictionary observation space containing:

x, y, z: Position coordinates
theta: Yaw angle
vx, vy, vz: Linear velocities
omega: Angular velocity

Action Space

Continuous action space with 4 dimensions:

Forward/Backward thrust
Left/Right thrust
Up/Down thrust
Yaw rotation

Reward Function

The default reward function considers:

Position error from target
Velocity penalties
Orientation error
Custom rewards can be implemented by extending the Reward class

📊 Examples

The examples directory contains several scripts demonstrating different uses:

test.py: Basic environment testing with manual control and evaluation with trained model
train.py: Training script using PPO

Running Examples

# Test environment with manual control
python examples/test.py

# Train an agent
python examples/train.py

🖼️ Visualization

The environment uses Meshcat for 3D visualization. When running with render_mode="human", a web browser window will open automatically showing the simulation. The visualization includes:

Water surface effects
Underwater environment
ROV model
Ocean floor with decorative elements (I am no good at this)

📚 Project Structure

bluerov2_gym/
├── bluerov2_gym/              # Main package directory
│   ├── assets/               # 3D models and resources
│   └── envs/                 # Environment implementation
│       ├── core/            # Core components
│       │   ├── dynamics.py  # Physics simulation
│       │   ├── rewards.py   # Reward functions
│       │   ├── state.py     # State management
│       │   └── visualization/
│       │       └── renderer.py  # 3D visualization
│       └── bluerov_env.py    # Main environment class
├── examples/                  # Example scripts
├── tests/                    # Test cases
└── README.md

🔧 Configuration

The environment can be configured through various parameters:

Physics parameters in dynamics.py
Reward weights in rewards.py
Visualization settings in renderer.py

📝 Citation

If you use this environment in your research, please cite:

@article{puthumanaillam2024tabfieldsmaximumentropyframework,
title={TAB-Fields: A Maximum Entropy Framework for Mission-Aware Adversarial Planning},
author={Gokul Puthumanaillam and Jae Hyuk Song and Nurzhan Yesmagambet and Shinkyu Park and Melkior Ornik},
year={2024},
eprint={2412.02570},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2412.02570} } 
}

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License

🙏 Acknowledgements

BlueRobotics for the BlueROV2 specifications
OpenAI/Farama Foundation for the Gymnasium framework
Meshcat for the visualization library

📧 Contact

Gokul Puthumanaillam - @gokulp01 - [gokulp2@illinois.edu]

Project Link: https://github.com/gokulp01/bluerov2_gym

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
bluerov2_gym		bluerov2_gym
docs		docs
examples		examples
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
__init__.py		__init__.py
pyproject.toml		pyproject.toml
setup.py		setup.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

BlueROV2 Gymnasium Environment

🌊 Features

🛠️ Installation

Prerequisites

Using uv (Recommended)

Using pip

🎮 Usage

Basic Usage

Training with Stable-Baselines3 (refer to examples/train.py for full code example)

🎯 Environment Details

State Space

Action Space

Reward Function

📊 Examples

Running Examples

🖼️ Visualization

📚 Project Structure

🔧 Configuration

📝 Citation

🤝 Contributing

📄 License

🙏 Acknowledgements

📧 Contact

About

Uh oh!

Releases

Packages

Languages

gokulp01/bluerov2_gym

Folders and files

Latest commit

History

Repository files navigation

BlueROV2 Gymnasium Environment

🌊 Features

🛠️ Installation

Prerequisites

Using uv (Recommended)

Using pip

🎮 Usage

Basic Usage

Training with Stable-Baselines3 (refer to examples/train.py for full code example)

🎯 Environment Details

State Space

Action Space

Reward Function

📊 Examples

Running Examples

🖼️ Visualization

📚 Project Structure

🔧 Configuration

📝 Citation

🤝 Contributing

📄 License

🙏 Acknowledgements

📧 Contact

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages