Reinforcement Learning Snake Game

Live gameplay demonstration of the trained Q-Learning agent navigating the board.

About the Project

A reinforcement learning agent that learns to play Snake using tabular Q-Learning in a custom Python environment. The agent improves purely through interaction with the game—no scripted strategy—learning a policy that balances reaching food with avoiding collisions.

Highlights

  • Custom Snake environment built for RL-style training and evaluation
  • Tabular Q-Learning with epsilon-greedy exploration and decay
  • Compact state encoding (8-bit binary, 256 discrete states) to keep learning feasible and fast
  • Reward design to speed up learning and reduce random wandering early on
  • Reproducible experiments with seeded runs and logged results
  • Visual outputs: training curves and gameplay GIFs
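
As an illustration of the compact encoding, here is a minimal sketch of one way to pack the observation into 8 bits. The exact bit layout is not given here, so this version assumes 4 collision-risk bits (up, down, left, right of the head) followed by 4 food-direction bits; the function name `encode_state` and its parameters are hypothetical.

```python
from typing import Set, Tuple

Point = Tuple[int, int]  # (x, y), with y increasing downward (assumed convention)


def encode_state(head: Point, food: Point, body: Set[Point],
                 width: int, height: int) -> int:
    """Pack 4 danger bits and 4 food-direction bits into an int in [0, 255].

    Assumed bit order, most significant first:
    danger up, down, left, right, then food up, down, left, right.
    """
    x, y = head
    neighbours = [(x, y - 1), (x, y + 1), (x - 1, y), (x + 1, y)]

    def blocked(p: Point) -> bool:
        # A cell is dangerous if it is off the board or occupied by the body.
        px, py = p
        return px < 0 or px >= width or py < 0 or py >= height or p in body

    danger = [blocked(p) for p in neighbours]
    fx, fy = food
    food_dir = [fy < y, fy > y, fx < x, fx > x]

    state = 0
    for bit in danger + food_dir:
        state = (state << 1) | int(bit)
    return state
```

The returned integer can index a row of the Q-table directly. Under this particular layout some bit patterns can never occur (food cannot be both above and below the head), so fewer than 256 rows would actually be visited.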

How it works

At each step, the agent observes a compact representation of the board (food direction + immediate collision risks), selects an action, receives a reward signal, and updates its Q-table via the Bellman update:

Q(s,a) ← Q(s,a) + α(r + γ max_a' Q(s',a') − Q(s,a))
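
The update rule above can be sketched in a few lines of Python. This is an illustrative version, not the project's code: the Q-table is a plain list of lists (a NumPy array would work the same way), 4 actions are assumed, and dropping the bootstrap term on terminal transitions is standard practice that is assumed here.

```python
from typing import List

ALPHA = 0.1      # learning rate
GAMMA = 0.9      # discount factor
N_STATES = 256   # one row per 8-bit state
N_ACTIONS = 4    # assumption: up, down, left, right


def make_q_table() -> List[List[float]]:
    """Return a zero-initialised Q-table of shape N_STATES x N_ACTIONS."""
    return [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def q_update(q: List[List[float]], s: int, a: int, r: float,
             s_next: int, done: bool) -> float:
    """Apply one tabular Q-learning update in place and return the TD error."""
    # On a terminal transition there is no future value to bootstrap from.
    best_next = 0.0 if done else max(q[s_next])
    td_error = r + GAMMA * best_next - q[s][a]
    q[s][a] += ALPHA * td_error
    return td_error
```

For example, starting from an all-zero table, a reward of 10 moves the corresponding entry a tenth of the way toward the target, because α = 0.1.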

Key hyperparameters: learning rate α = 0.1, discount factor γ = 0.9, and exploration rate ε multiplied by 0.9995 after each episode, decaying from 0.2 down to 0.05.
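
A minimal sketch of epsilon-greedy selection with this decay schedule, assuming 0.2 is the starting value and 0.05 is a floor that epsilon is clamped to. The helper names are hypothetical.

```python
import math
import random
from typing import Sequence

EPS_START = 0.2    # initial exploration rate
EPS_MIN = 0.05     # assumed floor
EPS_DECAY = 0.9995 # multiplicative decay per episode


def epsilon_at(episode: int) -> float:
    """Exploration rate after `episode` completed episodes, clamped at the floor."""
    return max(EPS_MIN, EPS_START * EPS_DECAY ** episode)


def select_action(q_row: Sequence[float], epsilon: float,
                  rng: random.Random) -> int:
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Ties resolve to the lowest action index.
    return max(range(len(q_row)), key=q_row.__getitem__)


# Number of episodes until the floor is reached: solve 0.2 * 0.9995**n <= 0.05.
episodes_to_floor = math.ceil(math.log(EPS_MIN / EPS_START) / math.log(EPS_DECAY))
```

Under these assumptions epsilon reaches its floor after roughly 2,770 episodes, so most of a long training run would proceed at the minimum exploration rate.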

Performance: best average score of 33.20 and highest single-game score of 62 (24.2% board coverage), after roughly 15,700 training episodes.
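
The coverage figure can be checked with simple arithmetic. The board size is not stated here; the sketch below assumes a 16 × 16 grid (256 cells) and that coverage means score divided by total cells, which is an inference from the numbers only.

```python
def board_coverage(cells_occupied: int, width: int, height: int) -> float:
    """Fraction of the board covered, given the number of occupied cells."""
    return cells_occupied / (width * height)


# Assumption: 16 x 16 board, with the score of 62 counted as occupied cells.
coverage = board_coverage(62, 16, 16)
print(f"{coverage:.1%}")
```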

What I learned

  • MDPs and value functions, exploration vs exploitation trade-offs
  • State design to prevent state-space explosion (8-bit encoding reduces 2^32 potential states to just 256)
  • Reward shaping for faster convergence: sparse rewards slow learning, so I designed a multi-tier structure with eating food (+10 + length × 0.5), moving toward food (+1.1), a survival bonus (+0.1 per step), and collision (-10)
  • Writing maintainable RL code with type hints, modular structure, and model persistence
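
The multi-tier reward structure listed above could be expressed as a single function. This is a sketch under stated assumptions: the tiers are treated as mutually exclusive (so +1.1 is the whole reward for a step toward food, not an addition to the survival bonus), and the function name and arguments are hypothetical.

```python
def step_reward(ate_food: bool, died: bool, moved_closer: bool,
                snake_length: int) -> float:
    """Return the shaped reward for one step, using the tier values listed above."""
    if died:
        return -10.0                        # collision penalty
    if ate_food:
        return 10.0 + 0.5 * snake_length    # food bonus grows with length
    if moved_closer:
        return 1.1                          # step that reduces distance to food
    return 0.1                              # plain survival bonus
```

Whether the tiers stack or replace each other changes the scale of the returns, so the precedence shown here is one reasonable reading, not a confirmed design.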

Next steps

  • Compare against DQN-based agents (Double/Dueling DQN) for scalability
  • Test with larger boards and richer state representations
  • Curriculum learning to gradually increase difficulty

Tech

Python • NumPy • Reinforcement Learning • Q-Learning

Performance snapshot showing the agent's achievements and key metrics.
