New robotics environments by matthiasplappert · Pull Request #912 · openai/gym

New robotics environments #912


Merged
124 commits merged on Feb 26, 2018
Changes from all commits
124 commits
cfbace5
all envs run offscreen
ethanabrooks Nov 8, 2017
1730538
render works
ethanabrooks Nov 8, 2017
5947270
changed mujoco-py version
ethanabrooks Nov 13, 2017
3fef1a1
Add fetch assets
Jan 17, 2018
cddf822
Merge branch 'mujoco-15' into fetch
Jan 17, 2018
e9d2ca3
Initial version of all envs with correct start position
Jan 17, 2018
17a6c0e
Actuation of robot
Jan 17, 2018
b4a4f51
Setup mostly done
Jan 17, 2018
3ab8617
Get observations
Jan 17, 2018
dfe540f
Observation space is specified by default
Jan 17, 2018
fffbd43
Correctly include box information
Jan 17, 2018
cdcbc7d
Properly reset
Jan 17, 2018
4db5cd9
No need to overwrite spaces
Jan 17, 2018
67d9e52
Remove more unnecessary overrides
Jan 17, 2018
1c3176c
Properly reset goals
Jan 17, 2018
bf45f8b
Simplify to case where we only ever have 1 box
Jan 18, 2018
86b8070
Add reward computation
Jan 18, 2018
b78c690
Fix target visualization
Jan 19, 2018
8e86931
Fix state bug
Jan 19, 2018
f5efcf8
Simplify assets
Jan 19, 2018
63cf479
Properly lock gripper in place
Jan 19, 2018
e545e20
Clean up Gym
Jan 23, 2018
649b5ab
Update Gym API
Jan 23, 2018
2f1f302
Introduce special goal-based space since we rely on this structure fo…
Jan 23, 2018
e565ca0
Remove unused assets
Jan 23, 2018
fefafa6
Bump versions
Jan 24, 2018
e54fc68
Address code review feedback
Jan 24, 2018
08c27dc
Pass in initial_qpos on init
Jan 24, 2018
cc9201e
Import hand model
Jan 24, 2018
7257175
First rough draft of a working hand reaching environment without any …
Jan 24, 2018
420f684
Initial qpos
Jan 24, 2018
90e7b3f
Fully functional reach environment
Jan 24, 2018
889cbe1
Use robot_get_obs
Jan 24, 2018
9d64da3
Merge remote-tracking branch 'origin/master' into fetch
Jan 24, 2018
2a5a257
Fully working hand with sensible values
Jan 25, 2018
a10af8d
Merge branch 'fetch' into hand
Jan 25, 2018
4c43cc4
Running version of block with Z and XYZ rotations
Jan 25, 2018
f0b297f
Fully working block envs
Jan 25, 2018
9d1fcd6
Use new fancy dog texture
Jan 25, 2018
977daef
Split into multiple files
Jan 25, 2018
ee1c83a
Implement more variants of the hand
Jan 25, 2018
32e500f
Implement parallel rotations
Jan 25, 2018
1a69ae0
Improve hand setup
Jan 25, 2018
bd63272
Implement pen rotation task
Jan 25, 2018
9e75b9f
Add ellipsoid version
Jan 25, 2018
14a0d8e
Rename environments
Jan 25, 2018
ab121af
Re-randomize if an initial configuration turns out to be invalid
Jan 25, 2018
bc7bc40
Do not directly set goal
Jan 25, 2018
93b22f9
Simplify environment setup
Jan 26, 2018
2665802
Make the block smaller
Jan 26, 2018
b801625
Update Fetch to new goal-based API
Jan 29, 2018
d7c9fa9
Add wrapper to flatten dict observation into an array
Jan 29, 2018
912a0ca
Merge remote-tracking branch 'origin/master' into hand
Jan 29, 2018
f00632b
Fix environment to work with changed Gym
Jan 29, 2018
b2dc092
Group similar things together
Jan 29, 2018
282a855
Move things around
Jan 29, 2018
e17a24f
Refactor into separate classes
Jan 29, 2018
6e3da41
Clean up manipulation code
Jan 29, 2018
a3f532f
Minor clean up
Jan 29, 2018
bfc6c6d
More clean up
Jan 29, 2018
f407dc7
Use seeded random numbers
Jan 29, 2018
d205492
Add extras
Jan 29, 2018
7faf9f7
Document remaining methods
Jan 29, 2018
5fb2b3b
Fix wrapper
Jan 29, 2018
87d10ed
Fix bug
Jan 30, 2018
92083c4
Add documentation
Jan 30, 2018
45e3486
Minor fixes
Jan 30, 2018
25ee65e
Bake action range into environment
Jan 30, 2018
557e713
Forward new method in wrapper
Jan 30, 2018
e0f0abe
Good default camera angle
Jan 30, 2018
c94ad2b
Better looking fetch
Jan 30, 2018
97369c3
Add acknowledgments for models
Jan 30, 2018
d697641
Rename to LICENSE
Jan 30, 2018
1b5c1c3
Add support for dense vs. sparse version
Jan 30, 2018
e573a3e
Increase damping coefficient
Jan 30, 2018
acb9db2
Minor fixes
Jan 31, 2018
a416f61
Include box position
Jan 31, 2018
52ae18e
A simple script to render an environment for quick inspection
Jan 31, 2018
e44c76c
Make sure that we don't change the action outside of the current scope
Jan 31, 2018
39d3f0b
Update fetch environments to have sufficient light
Jan 31, 2018
2d788a7
Ignore MuJoCo warnings in step
Feb 1, 2018
89137d9
Rename method
Feb 1, 2018
4ba6a6d
Address most feedback from code review
Feb 1, 2018
ef79380
Add additional documentation
Feb 1, 2018
d612429
Clarify role of desired goal and achieved goal
Feb 1, 2018
6552b44
Enforce action range to avoid simulation instabilities
Feb 1, 2018
5f01546
Fix pen manipulation
Feb 1, 2018
f70cd5d
Change objects
Feb 2, 2018
fda768d
Add easter egg
Feb 2, 2018
c34acbf
Update pen texture
Feb 2, 2018
70e74c4
Fix target computation
Feb 2, 2018
d7f6532
Fix pen environment
Feb 2, 2018
c73fc7c
Fix
Feb 2, 2018
d399de5
Implement easter egg mode
Feb 5, 2018
f986a04
Move floor down a bit to avoid that the hand can cheat
Feb 5, 2018
64fec19
Do not catch exceptions by default
Feb 5, 2018
144d464
Use free joint for Fetch tasks
Feb 7, 2018
0f8b08e
Use quaternions for rotations because setting Euler angles can end up…
Feb 7, 2018
6726d45
Decrease tolerance for manipulation tasks. Now 1cm for position and ~…
Feb 8, 2018
bdb42a4
Fix linter issue
Feb 8, 2018
5ddbb49
Fix reaching for the ring finger
Feb 8, 2018
4e44c09
Require a bit more precision for hand reaching
Feb 8, 2018
f67a311
Merge remote-tracking branch 'origin/master' into robotics-envs
Feb 8, 2018
da9264d
Make pen spinning a bit simpler
Feb 9, 2018
a97dc6a
Weigh position offset accordingly
Feb 9, 2018
34aaef4
Remove GoalDict
Feb 12, 2018
db3b688
Fix hand reach
Feb 12, 2018
135a1b8
Ring finger behaves strangely
Feb 13, 2018
be817ed
Fix hand reach
Feb 13, 2018
3145acf
Forward step for visualization
Feb 13, 2018
1e27b3d
Minor fixes
Feb 13, 2018
6ac8cf9
Include final texture for hidden block
Feb 13, 2018
b98e5b0
Update Fetch environment skins
Feb 15, 2018
c00af8a
Apply final look to hand environments
Feb 15, 2018
7498c5f
Skin pen
Feb 15, 2018
4927ef5
Remove unused texture
Feb 15, 2018
913532a
Merge branch 'master' into robotics-envs
Feb 15, 2018
b9d8ac5
Add alias for all hand manipulation envs
Feb 15, 2018
a7d3eeb
Add markers so that top and bottom of pen can be differentiated
Feb 15, 2018
aa5839d
Change wrapper name
Feb 20, 2018
e291341
Fix rendering to image bug
Feb 26, 2018
4480f9a
Bump version
Feb 26, 2018
d0ddc35
Include robotics in README
Feb 26, 2018
a5b7e81
Fix visualizer for pendulum
Feb 26, 2018
94 changes: 56 additions & 38 deletions README.rst
@@ -15,12 +15,12 @@ If you're not sure where to start, we recommend beginning with the

A whitepaper for OpenAI Gym is available at http://arxiv.org/abs/1606.01540, and here's a BibTeX entry that you can use to cite it in a publication::

    @misc{1606.01540,
      Author = {Greg Brockman and Vicki Cheung and Ludwig Pettersson and Jonas Schneider and John Schulman and Jie Tang and Wojciech Zaremba},
      Title = {OpenAI Gym},
      Year = {2016},
      Eprint = {arXiv:1606.01540},
    }

.. contents:: **Contents of this document**
:depth: 2
@@ -50,15 +50,15 @@ You can perform a minimal install of ``gym`` with:

.. code:: shell

    git clone https://github.com/openai/gym.git
    cd gym
    pip install -e .

If you prefer, you can do a minimal install of the packaged version directly from PyPI:

.. code:: shell

    pip install gym

You'll be able to run a few environments right away:

@@ -80,13 +80,13 @@ On OSX:

.. code:: shell

    brew install cmake boost boost-python sdl2 swig wget

On Ubuntu 14.04:

.. code:: shell

    apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl libboost-all-dev libsdl2-dev swig

MuJoCo has a proprietary dependency we can't set up for you. Follow
the
@@ -102,7 +102,7 @@ We currently support Linux and OS X running Python 2.7 or 3.5. Some users on OSX

.. code:: shell

    brew install boost-python --with-python3

If you want to access Gym from languages other than python, we have limited support for non-python
frameworks, such as lua/Torch, using the OpenAI Gym `HTTP API <https://github.com/openai/gym-http-api>`_.
@@ -154,10 +154,10 @@ sequence.

.. code:: python

    import gym
    env = gym.make('Copy-v0')
    env.reset()
    env.render()

Atari
-----
@@ -166,10 +166,10 @@ The Atari environments are a variety of Atari video games. If you didn't do the

.. code:: python

    import gym
    env = gym.make('SpaceInvaders-v0')
    env.reset()
    env.render()

This will install ``atari-py``, which automatically compiles the `Arcade Learning Environment <http://www.arcadelearningenvironment.org/>`_. This can take quite a while (a few minutes on a decent laptop), so just be prepared.

@@ -180,10 +180,10 @@ Box2d is a 2D physics engine. You can install it via ``pip install -e '.[box2d]

.. code:: python

    import gym
    env = gym.make('LunarLander-v2')
    env.reset()
    env.render()

Classic control
---------------
@@ -192,10 +192,10 @@ These are a variety of classic control tasks, which would appear in a typical re

.. code:: python

    import gym
    env = gym.make('CartPole-v0')
    env.reset()
    env.render()

MuJoCo
------
@@ -208,10 +208,27 @@ to set it up. You'll also have to run ``pip install -e '.[mujoco]'`` if you didn't do the full install.

.. code:: python

    import gym
-   env = gym.make('Humanoid-v1')
+   env = gym.make('Humanoid-v2')
    env.reset()
    env.render()


Robotics
--------

`MuJoCo <http://www.mujoco.org/>`_ is a physics engine which can do
very detailed, efficient simulations with contacts, and we use it for all robotics environments. It's not
open-source, so you'll have to follow the instructions in `mujoco-py
<https://github.com/openai/mujoco-py#obtaining-the-binaries-and-license-key>`_
to set it up. You'll also have to run ``pip install -e '.[mujoco]'`` if you didn't do the full install.

.. code:: python

    import gym
    env = gym.make('HandManipulateBlock-v0')
    env.reset()
    env.render()
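
These environments return dictionary observations with ``observation``, ``achieved_goal``, and ``desired_goal`` keys, so a rollout loop reads slightly differently. A minimal sketch (the ``assert`` relies on the ``compute_reward`` contract documented in ``gym/core.py`` below):

.. code:: python

    import gym

    env = gym.make('HandManipulateBlock-v0')
    obs = env.reset()
    for _ in range(100):
        # A real agent would condition on obs['observation'] and
        # obs['desired_goal']; here we simply sample random actions.
        obs, reward, done, info = env.step(env.action_space.sample())
        # The reward can always be recomputed from the achieved and desired goals.
        assert reward == env.compute_reward(obs['achieved_goal'], obs['desired_goal'], info)
        if done:
            obs = env.reset()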

Toy text
--------
@@ -220,10 +237,10 @@ Toy environments which are text-based. There's no extra dependency to install, s

.. code:: python

    import gym
    env = gym.make('FrozenLake-v0')
    env.reset()
    env.render()

Examples
========
@@ -241,14 +258,15 @@ We are using `pytest <http://doc.pytest.org>`_ for tests. You can run them via:

.. code:: shell

    pytest


.. _See What's New section below:

What's new
==========

- 2018-02-28: Release of a set of new robotics environments.
- 2018-01-25: Made some aesthetic improvements and removed unmaintained parts of gym. This may seem like a downgrade in functionality, but it is actually a long-needed cleanup in preparation for some great new things that will be released in the next month.

+ Now your `Env` and `Wrapper` subclasses should define `step`, `reset`, `render`, `close`, `seed` rather than underscored method names.
21 changes: 21 additions & 0 deletions bin/render.py
@@ -0,0 +1,21 @@
#!/usr/bin/env python3
import argparse
import gym


parser = argparse.ArgumentParser(description='Renders a Gym environment for quick inspection.')
parser.add_argument('env_id', type=str, help='the ID of the environment to be rendered (e.g. HalfCheetah-v1)')
parser.add_argument('--step', type=int, default=1)
args = parser.parse_args()

env = gym.make(args.env_id)
env.reset()

step = 0
while True:
    if args.step:
        env.step(env.action_space.sample())
    env.render()
    if step % 10 == 0:
        env.reset()
    step += 1
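
Typical usage would be, e.g., ``python bin/render.py HandManipulateBlock-v0`` with any registered environment id; per the code above, passing ``--step 0`` renders the initial state without stepping the simulation.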
2 changes: 1 addition & 1 deletion gym/__init__.py
@@ -7,7 +7,7 @@
from gym.utils import reraise
from gym.version import VERSION as __version__

-from gym.core import Env, Space, Wrapper, ObservationWrapper, ActionWrapper, RewardWrapper
+from gym.core import Env, GoalEnv, Space, Wrapper, ObservationWrapper, ActionWrapper, RewardWrapper
from gym.envs import make, spec
from gym import wrappers, spaces, logger

44 changes: 44 additions & 0 deletions gym/core.py
@@ -1,6 +1,7 @@
from gym import logger
import numpy as np

import gym
from gym import error
from gym.utils import closer

@@ -150,6 +151,46 @@ def __str__(self):
        else:
            return '<{}<{}>>'.format(type(self).__name__, self.spec.id)


class GoalEnv(Env):
    """A goal-based environment. It functions just like any regular OpenAI Gym environment, but it
    imposes a required structure on the observation_space. More concretely, the observation
    space is required to contain at least three elements, namely `observation`, `desired_goal`, and
    `achieved_goal`. Here, `desired_goal` specifies the goal that the agent should attempt to achieve.
    `achieved_goal` is the goal that the agent has currently achieved. `observation` contains the
    actual observations of the environment as per usual.
    """

    def reset(self):
        # Enforce that each GoalEnv uses a Goal-compatible observation space.
        if not isinstance(self.observation_space, gym.spaces.Dict):
            raise error.Error('GoalEnv requires an observation space of type gym.spaces.Dict')
        result = super(GoalEnv, self).reset()
        for key in ['observation', 'achieved_goal', 'desired_goal']:
            if key not in result:
                raise error.Error('GoalEnv requires the "{}" key to be part of the observation dictionary.'.format(key))
        return result

    def compute_reward(self, achieved_goal, desired_goal, info):
        """Compute the step reward. This externalizes the reward function and makes
        it dependent on a desired goal and the one that was achieved. If you wish to include
        additional rewards that are independent of the goal, you can include the necessary values
        to derive it in info and compute it accordingly.

        Args:
            achieved_goal (object): the goal that was achieved during execution
            desired_goal (object): the desired goal that we asked the agent to attempt to achieve
            info (dict): an info dictionary with additional information

        Returns:
            float: The reward that corresponds to the provided achieved goal w.r.t. the desired
                goal. Note that the following should always hold true:

                    ob, reward, done, info = env.step(action)
                    assert reward == env.compute_reward(ob['achieved_goal'], ob['desired_goal'], info)
        """
        raise NotImplementedError()

# Space-related abstractions

class Space(object):
@@ -249,6 +290,9 @@ def close(self):

    def seed(self, seed=None):
        return self.env.seed(seed)

    def compute_reward(self, achieved_goal, desired_goal, info):
        return self.env.compute_reward(achieved_goal, desired_goal, info)

    def __str__(self):
        return '<{}{}>'.format(type(self).__name__, self.env)
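
To illustrate the structure that ``GoalEnv`` enforces, here is a minimal hypothetical subclass — a toy bit-flipping task, not taken from this diff — that satisfies the required observation keys and the ``compute_reward`` contract:

.. code:: python

    import numpy as np

    import gym
    from gym import spaces


    class BitFlipEnv(gym.GoalEnv):
        """Hypothetical toy task: flip bits until the state matches the goal."""

        def __init__(self, n_bits=8):
            self.n_bits = n_bits
            self.action_space = spaces.Discrete(n_bits)
            self.observation_space = spaces.Dict({
                'observation': spaces.MultiBinary(n_bits),
                'achieved_goal': spaces.MultiBinary(n_bits),
                'desired_goal': spaces.MultiBinary(n_bits),
            })

        def reset(self):
            self.state = np.random.randint(2, size=self.n_bits)
            self.goal = np.random.randint(2, size=self.n_bits)
            return self._get_obs()

        def step(self, action):
            self.state[action] = 1 - self.state[action]  # flip the selected bit
            obs = self._get_obs()
            info = {}
            reward = self.compute_reward(obs['achieved_goal'], obs['desired_goal'], info)
            return obs, reward, reward == 0.0, info

        def compute_reward(self, achieved_goal, desired_goal, info):
            # Sparse reward: 0 once the goal is reached, -1 otherwise.
            return 0.0 if np.array_equal(achieved_goal, desired_goal) else -1.0

        def _get_obs(self):
            return {
                'observation': self.state.copy(),
                'achieved_goal': self.state.copy(),
                'desired_goal': self.goal.copy(),
            }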
