PPO Implementation in PyTorch for LunarLander-v2

Fascinated by "IMPLEMENTATION MATTERS IN DEEP POLICY GRADIENTS: A CASE STUDY ON PPO AND TRPO", I wrote PPO code in PyTorch to see whether the code-level optimizations work for LunarLander-v2. And they do, to some extent.

How to train

Pick a .yaml config file from the config directory and run the following command. You can also write your own .yaml file, but make sure it has all the necessary options.

$ python main.py --config PPO_M.yaml
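
If you write your own config, a quick check for missing options can save a failed run. The sketch below is not code from this repository. It assumes the config has already been loaded into a dict, and the names in `REQUIRED_OPTIONS` are hypothetical placeholders; substitute the options that actually appear in the provided files such as PPO_M.yaml.

```python
from typing import Any, Iterable, List, Mapping

# Hypothetical option names, for illustration only. The real set is whatever
# the repository's own config files (e.g. PPO_M.yaml) contain.
REQUIRED_OPTIONS = ("env", "train", "model")


def missing_options(
    config: Mapping[str, Any],
    required: Iterable[str] = REQUIRED_OPTIONS,
) -> List[str]:
    """Return the required option names absent from config, in the given order."""
    return [name for name in required if name not in config]


def check_config(
    config: Mapping[str, Any],
    required: Iterable[str] = REQUIRED_OPTIONS,
) -> None:
    """Raise KeyError listing every missing option; do nothing if all are present."""
    missing = missing_options(config, required)
    if missing:
        raise KeyError("config is missing required options: " + ", ".join(missing))
```

With PyYAML installed, the dict could come from `yaml.safe_load` applied to the opened file; whether main.py loads its config this way is an assumption.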

How to play

$ python main.py --config PPO_M.yaml --eval
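
The two commands differ only in the `--eval` flag. For readers who want to build a similar entry point, here is a minimal sketch of how such a command line could be parsed with `argparse`. It is an illustration, not the contents of main.py: the function name `parse_args` and the help strings are assumptions, and only the flag names `--config` and `--eval` come from the commands above.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in this README; argv=None reads sys.argv."""
    parser = argparse.ArgumentParser(description="Train or evaluate a PPO agent.")
    # Name of a .yaml file, e.g. PPO_M.yaml (assumed to live in the config directory).
    parser.add_argument("--config", required=True, help="config .yaml file to use")
    # Boolean switch: absent means train, present means play with a trained agent.
    parser.add_argument("--eval", action="store_true", help="evaluate instead of training")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    mode = "eval" if args.eval else "train"
    print(f"config={args.config} mode={mode}")
```

Using `action="store_true"` keeps training as the default, so the same config file serves both commands.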

How to run Bayesian Optimization for hyperparameters

$ python search.py


junkwhinger/PPO_PyTorch
