In *Affordance Learning for End-to-End Visuomotor Robot Control*, we introduced a modular deep neural network architecture that detects a container on a table and inserts a ball into it.
We showed that the system performs its task successfully in a zero-shot sim-to-real transfer setting: each part of the system was trained entirely on synthetic data or in simulation, and the system remained invariant to, for example, distractor objects and textures.
We have divided our work into the following components:
- AffordanceVAED extracts affordance information from an observation image and represents it as a latent-space vector. Figure 1 shows the structure of the model.
- BlenderDomainRandomizer generates a domain-randomized dataset for VAED.
- TrajectoryVAE represents trajectories in a low-dimensional latent space and generates a trajectory from a given latent vector.
- affordance_gym generates training data for TrajectoryVAE and combines VAED and TrajectoryVAE to perform desired trajectories based on an observation (a minimal sketch of this pipeline follows the list).
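To make the division of labour concrete, the sketch below illustrates how the pieces fit together at inference time: an observation image is encoded to an affordance latent, mapped to a trajectory latent, and decoded into a joint trajectory. The class names, network sizes, and method signatures (`AffordanceVAED.encode`, `TrajectoryVAE.decode`, the small policy MLP) are illustrative assumptions, not the actual API of the repositories.

```python
import torch
import torch.nn as nn

class AffordanceVAED(nn.Module):
    """Stand-in for the affordance VAE encoder: observation image -> affordance latent."""
    def __init__(self, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2 * latent_dim),           # posterior mean and log-variance
        )

    def encode(self, image):
        mu, logvar = self.encoder(image).chunk(2, dim=-1)
        return mu                                    # use the posterior mean at inference time

class TrajectoryVAE(nn.Module):
    """Stand-in for the trajectory VAE decoder: trajectory latent -> joint pose trajectory."""
    def __init__(self, latent_dim=5, n_steps=24, n_joints=7):
        super().__init__()
        self.n_steps, self.n_joints = n_steps, n_joints
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, n_steps * n_joints),
        )

    def decode(self, z):
        return self.decoder(z).view(-1, self.n_steps, self.n_joints)

# Stand-in for whatever maps the affordance latent to a trajectory latent in the real system.
policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 5))

vaed, traj_vae = AffordanceVAED(), TrajectoryVAE()
image = torch.rand(1, 3, 64, 64)                       # observation of the table scene
affordance_latent = vaed.encode(image)                  # image -> affordance latent
trajectory_latent = policy(affordance_latent)           # affordance latent -> trajectory latent
joint_trajectory = traj_vae.decode(trajectory_latent)   # trajectory latent -> joint trajectory
print(joint_trajectory.shape)                           # torch.Size([1, 24, 7])
```

The sketch only shows the data flow; the intermediate mapping between the two latent spaces is a placeholder for whatever connects them in the actual system.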
TrajectoryVAE represents task-suitable trajectories in a low-dimensional latent space. Its training data consist of joint pose trajectories generated by a planning algorithm.
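As an illustration of the idea, here is a minimal training sketch for such a trajectory VAE, using a standard reparameterized ELBO over flattened joint-pose trajectories. The trajectory length, joint count, latent dimension, and KL weight are placeholder assumptions, and the planner-generated data is faked with random tensors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder shapes: 24 time steps, 7 joints, 5-dimensional latent space.
N_STEPS, N_JOINTS, LATENT_DIM = 24, 7, 5

encoder = nn.Sequential(                             # flattened trajectory -> (mu, logvar)
    nn.Linear(N_STEPS * N_JOINTS, 128), nn.ReLU(),
    nn.Linear(128, 2 * LATENT_DIM),
)
decoder = nn.Sequential(                             # latent vector -> flattened trajectory
    nn.Linear(LATENT_DIM, 128), nn.ReLU(),
    nn.Linear(128, N_STEPS * N_JOINTS),
)
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
)

# Stand-in for planner-generated joint pose trajectories: (batch, steps, joints).
planned = torch.rand(32, N_STEPS, N_JOINTS)

for step in range(200):
    mu, logvar = encoder(planned.flatten(1)).chunk(2, dim=-1)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
    recon = decoder(z).view(-1, N_STEPS, N_JOINTS)
    recon_loss = F.mse_loss(recon, planned)                    # reconstruction term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon_loss + 1e-3 * kl                              # beta-weighted ELBO
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

At run time only the decoder is needed: a given latent vector is mapped back to a full joint trajectory.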
To install the required dependencies, run `pip install -r requirements.txt`.
- Generate training data with affordance_gym (scripts/generate_trajectories.py).
- Run `python main.py -h` to see how to include the generated training data and to explore the rest of the run options.