My solutions to the programming exercises in "Reinforcement Learning: An Introduction" (2nd Edition) [Sutton & Barto, 2018]
-
Chapter 2: Multi-armed Bandits
-
Chapter 4: Dynamic Programming
-
Chapter 5: Monte Carlo Methods
-
Chapter 6: Temporal-Difference Learning
-
Chapter 8: Planning and Learning with Tabular Methods
1 Original code by Sutton & Barto, written in Lisp.
Bug reports and suggestions are welcome! Feel free to contact me at jugoslav.stojcheski<at>gmail.com.