Generative Adversarial Imitation Learning

Jonathan Ho and Stefano Ermon

Contains an implementation of Trust Region Policy Optimization (Schulman et al., 2015).

Dependencies:

Provided files:

expert_policies/* are the expert policies, trained by TRPO (scripts/run_rl_mj.py) on the true costs
scripts/im_pipeline.py is the main training and evaluation pipeline. This script is responsible for sampling data from experts to generate training data, running the training code (scripts/imitate_mj.py), and evaluating the resulting policies.
pipelines/* are the experiment specifications provided to scripts/im_pipeline.py
results/* contain evaluation data for the learned policies
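
The three stages that scripts/im_pipeline.py is described as handling (sampling expert data, training, evaluating) can be illustrated with a toy sketch. This is not the repository's code. Every name below (sample_expert_data, train_imitator, evaluate, run_pipeline) is hypothetical, the expert is a made-up one-dimensional linear policy, and the learner is a simple least-squares fit standing in for the adversarial training that scripts/imitate_mj.py performs.

```python
import random


def sample_expert_data(expert_policy, num_trajs=5, traj_len=20, seed=0):
    """Stage 1: roll out the expert and record (observation, action) pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_trajs * traj_len):
        obs = rng.uniform(-1.0, 1.0)  # toy 1-D observation, not a MuJoCo state
        data.append((obs, expert_policy(obs)))
    return data


def train_imitator(data):
    """Stage 2: fit a policy to the expert data.

    Stand-in learner (assumption): least-squares fit of action = w * obs.
    This is plain behavioral cloning, used only to keep the sketch short.
    """
    num = sum(obs * act for obs, act in data)
    den = sum(obs * obs for obs, _ in data)
    w = num / den
    return lambda obs: w * obs


def evaluate(policy, expert_policy, num_points=100, seed=1):
    """Stage 3: mean squared action error against the expert on fresh inputs."""
    rng = random.Random(seed)
    points = [rng.uniform(-1.0, 1.0) for _ in range(num_points)]
    return sum((policy(x) - expert_policy(x)) ** 2 for x in points) / num_points


def run_pipeline(expert_policy):
    """Chain the three stages in the order the README lists them."""
    data = sample_expert_data(expert_policy)
    learned = train_imitator(data)
    return evaluate(learned, expert_policy)


if __name__ == "__main__":
    toy_expert = lambda obs: 2.0 * obs  # hypothetical expert policy
    print("mean squared action error:", run_pipeline(toy_expert))
```

Keeping each stage as a separate function mirrors the split the README describes, where the pipeline script drives data sampling and evaluation and hands training to a separate script.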
