This project implements a custom Blackjack environment, `Blackjack25Env`, designed for reinforcement learning experiments. It is based on OpenAI Gym and extends the classic Blackjack game with a modified rule set. The project includes tools for training agents using Monte Carlo control, visualizing policies and state-value functions, and simulating games. The environment supports rendering with `pygame` for visualizing the game state and provides a complete pipeline for reinforcement learning, from environment creation to policy evaluation.
- **Custom Blackjack Environment:**
  - The player's goal is to achieve a hand total as close to 25 as possible without exceeding it (see the sketch after this feature list).
  - Includes rules for "usable aces" and "natural blackjack" (sum of 25 with an ace and a ten).
  - Fully compatible with OpenAI Gym's API.
- **Reinforcement Learning:**
  - Monte Carlo control algorithm implemented to train agents.
  - Epsilon-greedy policy for exploration and exploitation.
- **Visualization:**
  - 3D plots of state-value functions.
  - Heatmaps of optimal policies.
- **Simulation:**
  - Play games with random or trained policies.
  - Render the game visually using `pygame`.
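Because the target total is 25 rather than the usual 21, an ace counted as 11 stays "usable" for larger hand totals than in classic Blackjack. The sketch below illustrates how a hand could be evaluated against the 25 limit; it is an illustration only, not the environment's internal code.

```python
# Illustration only (not the environment's internal code): evaluate a hand
# against the 25 limit described above, counting one ace as 11 whenever that
# does not bust the hand (a "usable ace").
def hand_value(cards, limit=25):
    total = sum(cards)  # aces counted as 1 here
    usable_ace = 1 in cards and total + 10 <= limit
    return (total + 10 if usable_ace else total), usable_ace

print(hand_value([1, 10]))     # (21, True)  -- the ace is still usable under the 25 limit
print(hand_value([1, 10, 5]))  # (16, False) -- counting the ace as 11 would bust
```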
```
.
├── blackjack25.py              # Implementation of the Blackjack25Env environment.
├── blackjack25_env.py          # Environment registration for OpenAI Gym.
├── Monte_Carlo_Solution.ipynb  # Notebook for training and evaluating agents.
├── plot_utils.py               # Utility functions for plotting state values and policies.
├── __pycache__/                # Compiled Python files (auto-generated).
```
- Clone the repository:

  ```bash
  git clone https://github.com/your-repo/blackjack25.git
  cd blackjack25
  ```

- Register the custom environment:

  ```python
  import blackjack25_env
  ```

- Verify installation:

  ```python
  import gym

  env = gym.make('Blackjack25-v0')
  print(env.observation_space)
  print(env.action_space)
  ```
The environment is defined in `blackjack25.py` as `Blackjack25Env`. Key details (a short policy sketch follows this list):
- **Observation Space:** a tuple `(player_sum, dealer_card, usable_ace)`, where:
  - `player_sum`: Player's current hand total (0–31).
  - `dealer_card`: Dealer's face-up card (1–10).
  - `usable_ace`: Whether the player has a usable ace (`True` or `False`).
- **Action Space:**
  - `STICK = 0`: Stop drawing cards.
  - `HIT = 1`: Draw another card.
- **Rewards:**
  - `+1`: Player wins.
  - `-1`: Player loses.
  - `0`: Draw.
  - `+1.5`: Bonus for a "natural blackjack" (if `natural=True`).
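As a quick illustration of these interfaces, here is a minimal hand-written policy over the documented observation tuple. The threshold value is an arbitrary choice for demonstration, not part of the environment.

```python
# Illustrative fixed-threshold policy over the documented observation tuple
# (player_sum, dealer_card, usable_ace). STICK = 0 and HIT = 1 as listed above;
# the threshold of 21 is an arbitrary choice for demonstration.
STICK, HIT = 0, 1

def threshold_policy(state, threshold=21):
    player_sum, dealer_card, usable_ace = state
    return HIT if player_sum < threshold else STICK
```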
The notebook `Monte_Carlo_Solution.ipynb` demonstrates how to train an agent using Monte Carlo control. Key functions:

- `mc_control(env, num_episodes, alpha, gamma)`: Trains an agent using Monte Carlo control.
- `generate_episode_from_Q(env, Q, epsilon, nA)`: Generates episodes using an epsilon-greedy policy.
- `update_Q(env, episode, Q, alpha, gamma)`: Updates the action-value function.
Example usage:

```python
from Monte_Carlo_Solution import mc_control

# Train the agent
policy, Q = mc_control(env, num_episodes=500000, alpha=0.02)

# Visualize the state-value function
from plot_utils import plot_blackjack_values
V = dict((k, max(v)) for k, v in Q.items())
plot_blackjack_values(V)

# Visualize the policy
from plot_utils import plot_policy
plot_policy(policy)
```
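The update rule itself is not reproduced in this README. The sketch below shows one common way to implement `update_Q` with the signature listed above, using a constant-alpha, every-visit Monte Carlo update; it is an assumption about the approach, not the notebook's actual code.

```python
import numpy as np

# Sketch of a constant-alpha, every-visit Monte Carlo update with the signature
# listed above. Assumes Q maps each state to an array of action values.
def update_Q(env, episode, Q, alpha, gamma):
    """Move Q[state][action] toward the discounted return observed from each step."""
    states, actions, rewards = zip(*episode)
    discounts = np.array([gamma ** i for i in range(len(rewards) + 1)])
    for i, (state, action) in enumerate(zip(states, actions)):
        G = sum(rewards[i:] * discounts[:len(rewards) - i])  # return following step i
        Q[state][action] += alpha * (G - Q[state][action])
    return Q
```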
You can simulate games with a random or trained policy. Example:

```python
state, _ = env.reset()
while True:
    action = env.action_space.sample()  # replace with a trained policy for better play
    state, reward, done, _, _ = env.step(action)
    if done:
        print(f"Game Over! Reward: {reward}")
        break
```
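To play from the trained policy instead, the sampled action can be replaced by a greedy lookup. The sketch below assumes `policy` is a dictionary mapping observation tuples to actions, as returned by `mc_control` above, and falls back to a random action for unseen states.

```python
# Greedy play from the trained policy (sketch). Assumes `policy` maps
# observation tuples to actions; unseen states fall back to a random action.
state, _ = env.reset()
while True:
    action = policy.get(state, env.action_space.sample())
    state, reward, done, _, _ = env.step(action)
    if done:
        print(f"Game Over! Reward: {reward}")
        break
```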
The environment supports rendering with `pygame`. To enable rendering:

```python
env = gym.make('Blackjack25-v0', render_mode='human')
env.reset()
env.step(env.action_space.sample())
```
The file `plot_utils.py` provides functions for visualizing the results:

- **State-Value Function:**

  ```python
  from plot_utils import plot_blackjack_values
  plot_blackjack_values(V)
  ```

- **Policy Heatmap:**

  ```python
  from plot_utils import plot_policy
  plot_policy(policy)
  ```
The environment supports custom rules via the `natural` and `sab` flags:

- `natural=True`: Enables a bonus reward for a "natural blackjack."
- `sab=True`: Follows the rules from Sutton and Barto's book.

Example:

```python
env = gym.make('Blackjack25-v0', natural=True, sab=False)
```
- Python 3.8+
- `gym`
- `pygame`
- `numpy`
- `matplotlib`

Install dependencies with:

```bash
pip install gym pygame numpy matplotlib
```
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Submit a pull request with a detailed description.
This project is licensed under the MIT License. See the LICENSE file for details.
- OpenAI Gym for the base environment framework.
- Sutton and Barto's "Reinforcement Learning: An Introduction" for inspiration.
- Pixel art by Mariia Khmelnytska (source).
For questions or feedback, please open an issue or contact the repository owner.