Code-SorceryLab/DRL-Agents

Multi-Game RL Framework

A Deep Reinforcement Learning framework for training, evaluating, and analysing agents across four games. It supports multiple algorithms, configurable reward personas, difficulty presets, and a full metrics/graphing pipeline, all driven through an interactive menu.


Games

  Key     Game          Termination
  flappy  Flappy Bird   Hits a pipe or the ground
  snake   Snake         Wall/self collision, or 1-min apple-inactivity timeout
  pong    Pong          Ball goes out of bounds
  dk      Donkey Kong   Hit by a barrel, or falls

Algorithms

  Key     Algorithm
  ppo     PPO
  rppo    RecurrentPPO
  trpo    TRPO
  a2c     A2C

Quick Start

python -m venv venv
venv\Scripts\activate      # Windows; on macOS/Linux: source venv/bin/activate
pip install -U pip
pip install -r requirements.txt
python menu.py

Menu Options

1.  Run Training
2.  Run Evaluation          (normal or unlimited; pick game or all; pick difficulty)
3.  View TensorBoard Logs
4.  Play Game Manually      (keyboard, pick difficulty)
5.  Show Project Status
6.  Run Random Baseline     (pick game or all, pick difficulty)
7.  Watch Trained Agent     (pick game + model + difficulty)
8.  Train All Models for One Game
9.  Train Complete Grid
10. Metrics / Graphs
11. Record Gameplay         (save MP4 or GIF)
12. Delete Logs / Models
13. Exit

Repository Structure

DRL-Agents/
├── menu.py                         # Interactive menu (main entry point)
├── requirements.txt
├── pyproject.toml
├── games.md                        # Game-specific notes
├── tensorboard.md                  # TensorBoard usage
│
└── code/
    ├── conf/
    │   ├── grid.yaml               # Global training config (games, models, personas, skills)
    │   ├── algo/                   # Algorithm hyperparameters (ppo, rppo, trpo, a2c)
    │   ├── game/                   # Game configs (snake, flappy, pong, dk)
    │   ├── reward/                 # Persona reward configs
    │   │   ├── snake_baseline.yaml
    │   │   ├── flappy_baseline.yaml
    │   │   ├── pong_baseline.yaml
    │   │   └── dk_baseline.yaml
    │   ├── robustness/             # Difficulty overrides (easy / default / hard per game)
    │   └── callback/
    │
    ├── games/
    │   ├── snake_core.py
    │   ├── flappy_core.py
    │   ├── pong_core.py
    │   └── dk_core.py
    │
    ├── rewards/                    # Reward function implementations
    ├── metrics/                    # Per-game metrics collectors
    │   ├── snake_balance.py
    │   ├── flappy_balance.py
    │   ├── pong_balance.py
    │   └── dk_balance.py
    │
    ├── wrappers/
    │   └── generic_env.py          # Universal Gym wrapper (reward fn, HUD, dt, apple timeout)
    │
    └── scripts/
        ├── train.py
        ├── evaluate.py
        ├── manual_play.py
        ├── watch_agent.py
        ├── random_eval.py
        ├── record_gameplay.py
        ├── analyze_metrics.py
        ├── metrics_utils.py
        ├── callbacks.py
        └── play.py

Configuration

grid.yaml

The central config for grid-based experiments. Key fields:

games:    [flappy, snake, pong, dk]
models:   [ppo, rppo, trpo, a2c]
personas: [flappy_baseline, snake_baseline, pong_baseline, dk_baseline]

seed: 1234
device: cpu
n_envs: 10

skills:
  Custom: 10000000    # timesteps; override with +skills.Custom=N
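To make the grid semantics concrete, here is an illustrative sketch (not the project's actual training loop) of how the `games`, `models`, and `personas` lists expand into one training run per (game, model) pair, with each game matched to its `<game>_baseline` persona:

```python
# Illustrative sketch of grid expansion; the real loop lives in train.py.
from itertools import product

grid = {
    "games":  ["flappy", "snake", "pong", "dk"],
    "models": ["ppo", "rppo", "trpo", "a2c"],
}

# One run per (game, model) pair; the persona follows the game.
runs = [
    {"game": game, "model": model, "persona": f"{game}_baseline"}
    for game, model in product(grid["games"], grid["models"])
]

print(len(runs))   # 4 games x 4 algorithms = 16 runs
print(runs[0])     # {'game': 'flappy', 'model': 'ppo', 'persona': 'flappy_baseline'}
```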

Difficulty / Robustness

Each game has three difficulty configs in code/conf/robustness/:

  • <game>_default.yaml: standard training conditions
  • <game>_easy.yaml: forgiving settings
  • <game>_hard.yaml: punishing settings

These override game parameters (grid size, speed, penalties, etc.) at evaluation and manual play time.
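The override mechanics can be pictured as a shallow merge: keys present in the robustness config replace the matching keys in the base game config. The keys below (`grid_size`, `speed`, `death_penalty`) are invented for the example; check the actual YAML files for real parameter names.

```python
# Hypothetical illustration of how a difficulty config overrides the base
# game config; key names here are made up for the example.
base_cfg = {"grid_size": 12, "speed": 8, "death_penalty": -1.0}
hard_overrides = {"speed": 14, "death_penalty": -2.0}

# Later dict wins on key clashes, so the override values take effect.
effective = {**base_cfg, **hard_overrides}
print(effective)   # {'grid_size': 12, 'speed': 14, 'death_penalty': -2.0}
```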

Persona Configs

Live in code/conf/reward/, named <game>_<persona>.yaml. They control reward shaping via a pluggable reward function passed into GameEnv.
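A minimal sketch of what such a pluggable reward function could look like; the exact signature GameEnv expects is an assumption, and the event keys and weights here are illustrative stand-ins for values a `<game>_<persona>.yaml` file might supply:

```python
# Sketch only: the signature GameEnv actually expects may differ.
def make_snake_reward(weights):
    """Build a reward function from persona weights (illustrative keys)."""
    def reward_fn(event):
        r = 0.0
        if event.get("ate_apple"):
            r += weights.get("apple", 1.0)
        if event.get("died"):
            r += weights.get("death", -1.0)
        r += weights.get("step", 0.0)   # small per-step shaping term
        return r
    return reward_fn

reward_fn = make_snake_reward({"apple": 1.0, "death": -1.0, "step": -0.01})
print(reward_fn({"ate_apple": True}))   # 1.0 apple reward - 0.01 step cost = 0.99
```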

Algorithm Configs

Live in code/conf/algo/. Each file sets hyperparameters for its algorithm (learning rate, n_steps, batch size, etc.).


Training

Via menu (option 1, 8, or 9), or directly:

python -m code.scripts.train game=snake model=ppo persona=snake_baseline skill=Custom +skills.Custom=5000000

Outputs:

  • Best model: models/best/<game>_<algo>_<persona>_<skill>/best_model.zip
  • Checkpoints: models/checkpoints/
  • TensorBoard logs: mylogs/
  • Eval logs: models/eval_logs/

Evaluation

Via menu (option 2), or directly:

python -m code.scripts.evaluate --game snake --algo ppo --difficulty default --episodes 100

Outputs a CSV: <game>_<algo>_eval.csv

Columns: episode, game, algo, difficulty, score, apples, pipes, episode_return, episode_steps, terminated, truncated, forced_timeout
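A stdlib-only sketch of summarising such a CSV, for example mean score per difficulty. The three sample rows are invented and use only a subset of the columns listed above:

```python
# Sketch: mean score per difficulty from an eval CSV. Sample data is
# inlined so the example is self-contained; real runs read the CSV file.
import csv
import io

sample = """episode,game,algo,difficulty,score,episode_return
1,snake,ppo,default,12,10.5
2,snake,ppo,default,8,6.0
3,snake,ppo,hard,3,1.5
"""

by_difficulty = {}
for row in csv.DictReader(io.StringIO(sample)):
    by_difficulty.setdefault(row["difficulty"], []).append(float(row["score"]))

means = {d: sum(v) / len(v) for d, v in by_difficulty.items()}
print(means)   # {'default': 10.0, 'hard': 3.0}
```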

Difficulty options: easy, default, hard, all


Manual Play (option 4)

Controls:
  Snake:  W/Up, S/Down, A/Left, D/Right, ESC
  Flappy: SPACE = flap, ESC
  Pong:   W/Up = up, S/Down = down, ESC
  DK:     W/A/S/D = move, SPACE = jump, ESC

Watch Agent (option 7)

Renders a trained model playing in real time. Picks game, model, and difficulty from the menu.


Random Baseline (option 6)

Runs a random-action agent for comparison. Same CSV format as evaluation.
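The baseline amounts to sampling a uniform random action each step until the episode ends. A minimal sketch, with a toy stub standing in for the real environment (the Gym-style `reset`/`step` shape is assumed):

```python
# Sketch of a random baseline; StubEnv is a toy stand-in for GameEnv.
import random

class StubEnv:
    """Toy environment: episode ends after 5 steps, each step scores 1."""
    n_actions = 3

    def reset(self):
        self.t = 0
        return 0   # observation (ignored by a random agent)

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return 0, 1.0, done   # obs, reward, done

def run_random_episode(env):
    env.reset()
    total, done = 0.0, False
    while not done:
        action = random.randrange(env.n_actions)   # uniform random action
        _, reward, done = env.step(action)
        total += reward
    return total

print(run_random_episode(StubEnv()))   # 5.0
```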


Record Gameplay (option 11)

Saves an MP4 or GIF of a trained agent or manual play session. Output goes to outputs/.


Metrics and Graphs (option 10)

Generates plots from training and evaluation CSVs. All outputs written to models/metrics/.

Available graphs:

  • Reward vs timesteps (training curves)
  • Score (apples / pipes) vs timesteps
  • Evaluation bar charts comparing algorithms
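Raw training-reward curves are noisy, so curve plots are typically smoothed before rendering. A rolling mean like the one below is one common choice (not necessarily what analyze_metrics.py does):

```python
# Sketch of rolling-mean smoothing for reward-vs-timesteps curves.
def rolling_mean(values, window):
    """Average each point with up to window-1 preceding points."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

rewards = [0, 10, 0, 10, 20]
print(rolling_mean(rewards, window=2))   # [0.0, 5.0, 5.0, 5.0, 15.0]
```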

Extending the Project

Add a Persona

  1. Create code/conf/reward/<game>_<persona>.yaml
  2. Implement reward logic in code/rewards/
  3. Add persona name to grid.yaml

Add a Game

  1. Implement code/games/<game>_core.py (must expose get_action_space, get_observation_space, reset, step, render)
  2. Add code/conf/game/<game>.yaml with _target_ pointing to the core class
  3. Add robustness configs: code/conf/robustness/<game>_default/easy/hard.yaml
  4. Add a persona config and metrics collector
  5. Register the game in grid.yaml
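A skeleton for step 1, exposing the five required methods. The return types shown are assumptions; match the existing cores (e.g. snake_core.py) in a real implementation:

```python
# Skeleton for a new <game>_core.py; return types are assumed, so mirror
# the existing game cores when implementing for real.
class MyGameCore:
    def get_action_space(self):
        return 2                      # number of discrete actions (assumed)

    def get_observation_space(self):
        return (4,)                   # observation shape (assumed)

    def reset(self):
        self.t = 0
        return [0.0] * 4              # initial observation

    def step(self, action):
        self.t += 1
        obs = [float(self.t)] * 4
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= 100          # episode length cap for the toy game
        return obs, reward, done, {}  # obs, reward, done, info

    def render(self):
        pass                          # draw the current frame (e.g. pygame)

env = MyGameCore()
env.reset()
obs, reward, done, info = env.step(1)
print(reward, done)   # 1.0 False
```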

Additional Docs

  • games.md: per-game design notes
  • tensorboard.md: how to launch and read TensorBoard
