Reinforcement Learning Snake Game

Live gameplay demonstration of the trained Q-Learning agent navigating the board.

About the Project

A reinforcement learning agent that learns to play Snake using tabular Q-Learning in a custom Python environment. The agent improves purely through interaction with the game, with no scripted strategy, and learns a policy that balances reaching food with avoiding collisions.

Highlights

Custom Snake environment built for RL-style training and evaluation
Tabular Q-Learning with epsilon-greedy exploration and decay
Compact state encoding (8-bit binary, 256 discrete states) to keep learning feasible and fast
Reward design to speed up learning and reduce random wandering early on
Reproducible experiments with seeded runs and logged results
Visual outputs: training curves and gameplay GIFs
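
The compact state encoding can be illustrated with a minimal sketch. The bit layout here (four collision-risk bits followed by four food-direction bits) and the names `encode_state` and `observe` are assumptions made for illustration; the project only states that the state is an 8-bit value built from food direction and immediate collision risks.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[int, int]  # (x, y), with y increasing downward


def encode_state(danger: Iterable[bool], food: Iterable[bool]) -> int:
    """Pack eight boolean features into a single integer in [0, 255].

    Assumed layout (hypothetical, not confirmed by the project):
      bits 0-3: collision risk one step up, down, left, right
      bits 4-7: food lies above, below, left of, right of the head
    """
    state = 0
    for i, flag in enumerate((*danger, *food)):
        state |= int(bool(flag)) << i
    return state


def observe(head: Point, food: Point, body: Set[Point],
            width: int, height: int) -> int:
    """Build the 8-bit state from raw board positions."""
    hx, hy = head
    neighbours = [(hx, hy - 1), (hx, hy + 1), (hx - 1, hy), (hx + 1, hy)]

    def blocked(p: Point) -> bool:
        x, y = p
        off_board = not (0 <= x < width and 0 <= y < height)
        return off_board or p in body

    danger = [blocked(p) for p in neighbours]
    fx, fy = food
    food_dir = [fy < hy, fy > hy, fx < hx, fx > hx]
    return encode_state(danger, food_dir)
```

Because the result is a plain integer below 256, it can index a row of the Q-table directly.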

How it works

At each step, the agent observes a compact representation of the board (food direction + immediate collision risks), selects an action, receives a reward signal, and updates its Q-table via the Bellman update:

Q(s,a) ← Q(s,a) + α [r + γ max_a' Q(s',a') − Q(s,a)]
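
As a rough sketch, the update rule above maps onto a NumPy Q-table as follows. The four-action layout, the `done` flag, and the function name `q_update` are assumptions for illustration; dropping the bootstrap term on terminal transitions is a common convention rather than something the project states.

```python
import numpy as np

N_STATES = 256   # from the 8-bit state encoding
N_ACTIONS = 4    # assumption: up, down, left, right


def q_update(q: np.ndarray, s: int, a: int, r: float, s_next: int,
             done: bool, alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update in place and return the TD error."""
    # On a terminal step there is no next state to bootstrap from (assumed convention).
    target = r if done else r + gamma * np.max(q[s_next])
    td_error = target - q[s, a]
    q[s, a] += alpha * td_error
    return float(td_error)


q_table = np.zeros((N_STATES, N_ACTIONS))
```

The defaults for `alpha` and `gamma` are the learning rate and discount factor reported for this project.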

Key hyperparameters: learning rate α = 0.1, discount factor γ = 0.9, and an exploration rate ε that decays by a factor of 0.9995 per episode, from 0.2 down to a floor of 0.05.
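
One way to read the exploration schedule is a multiplicative decay with a floor. The sketch below assumes that reading; `epsilon_at` and `select_action` are hypothetical names, and the project may apply the decay differently (for example, by updating a running value after each episode).

```python
import numpy as np

EPS_START, EPS_MIN, EPS_DECAY = 0.2, 0.05, 0.9995


def epsilon_at(episode: int) -> float:
    """Exploration rate for a given episode index (0-based), clipped at the floor."""
    return max(EPS_MIN, EPS_START * EPS_DECAY ** episode)


def select_action(q_row: np.ndarray, epsilon: float,
                  rng: np.random.Generator) -> int:
    """Epsilon-greedy choice: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    return int(np.argmax(q_row))


rng = np.random.default_rng(seed=0)  # seeded for reproducible runs
action = select_action(np.zeros(4), epsilon_at(0), rng)
```

Under this reading, ε reaches its 0.05 floor after roughly 2,770 episodes, so most of a ~15,700-episode run would be spent at the minimum exploration rate.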

Performance: best average score of 33.20 and a highest single score of 62 (24.2% board coverage) after ~15,700 training episodes.

What I learned

MDPs and value functions, exploration vs exploitation trade-offs
State design to prevent state-space explosion (8-bit encoding reduces 2^32 potential states to just 256)
Reward shaping for faster convergence: sparse rewards slow learning, so I designed a multi-tier structure with eating food (+10 + 0.5 × length), moving toward food (+1.1), a survival bonus (+0.1 per step), and a collision penalty (−10)
Writing maintainable RL code with type hints, modular structure, and model persistence
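
The multi-tier reward structure listed above could be sketched as a single function. How the tiers combine is an assumption here: the sketch treats them as mutually exclusive and checks them in priority order, and the name `shaped_reward` and its boolean arguments are hypothetical.

```python
def shaped_reward(collided: bool, ate_food: bool,
                  moved_closer: bool, length: int) -> float:
    """Return the per-step reward, checking tiers from most to least decisive."""
    if collided:
        return -10.0                   # collision penalty
    if ate_food:
        return 10.0 + 0.5 * length     # food reward grows with snake length
    if moved_closer:
        return 1.1                     # stepped toward the food
    return 0.1                         # survival bonus for any other safe step
```

Giving a small positive signal on every safe step, and a larger one for approaching food, is what keeps the early policy from wandering at random while food rewards are still rare.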

Next steps

Compare against DQN-based agents (Double/Dueling DQN) for scalability
Test with larger boards and richer state representations
Curriculum learning to gradually increase difficulty

Tech

Python • NumPy • Reinforcement Learning • Q-Learning

Performance snapshot showing the agent's achievements and key metrics.