Vibe Coding an AI that learns to play the Snake game

Today I’m Vibe Coding something and inviting you to follow along. This is not a very serious post. Its purpose is just to see how quickly we can build a neural network (NN) that plays the snake game, and whether we can improve it.

Steps we will follow:

1. Generating the simplest snake game you can play with WASD keys.
2. Generating an AI that learns to play, and watching it play.
3. Attempting to improve the generated code so it reaches better scores.

Step 1: Generating the simplest snake game you can play

Obviously, I needed to write a prompt to generate such a game, but being lazy (and not knowing too much about AI), I offloaded prompt generation to GPT:

Generate a good prompt for copilot AI in Visual Code so it generates code for the snake game. Snake game can be the simplest possible terminal based game on a small field.

The output was a pretty reasonable prompt, one I could have written myself (but, hey, that takes time). The only thing I updated in the prompt was the very last line, so that it keeps track of the game score:

# Write a simple snake game in Python that runs in the terminal.
# Requirements:
# - Keep the game as simple as possible.
# - Use a small fixed grid (e.g., 10x10).
# - The snake moves automatically in the last chosen direction.
# - Use WASD keys for movement (no fancy key handling needed, blocking input is fine).
# - Place food randomly; eating food makes the snake longer.
# - The game ends if the snake runs into itself or the walls.
# - Print the field after each move using simple ASCII characters:
#   - "." for empty space
#   - "O" for snake body
#   - "X" for snake head
#   - "*" for food
# - Keep the code in a single file, no external libraries beyond Python standard library.
# - Keep it short and readable.
# - Keep the score of the game. The score equals the total number of food eaten.

The generated code (with GPT-4o) was 73 lines long, and I could play the game in the terminal: https://github.com/andriybuday/snake-ia/blob/main/snake_game.py
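For a feel of what a prompt like this asks for, here is a minimal sketch of the core game logic. It is not the generated code from the repository: the function names (`place_food`, `step`, `render`, `play`), the starting position, and the initial direction are all assumptions made for illustration.

```python
import random

SIZE = 10  # the grid is SIZE x SIZE, as suggested in the prompt
MOVES = {"w": (-1, 0), "s": (1, 0), "a": (0, -1), "d": (0, 1)}  # (row, col) deltas


def place_food(snake, rng=random):
    """Pick a random empty cell for the food, or None when the grid is full."""
    free = [(r, c) for r in range(SIZE) for c in range(SIZE) if (r, c) not in snake]
    return rng.choice(free) if free else None


def step(snake, direction, food):
    """Advance one move. `snake` is a list of cells, head first.

    Returns (new_snake, ate_food, alive).
    """
    dr, dc = MOVES[direction]
    head = (snake[0][0] + dr, snake[0][1] + dc)
    ate = head == food
    # The tail cell is vacated on this move unless the snake grows.
    body = snake if ate else snake[:-1]
    hit_wall = not (0 <= head[0] < SIZE and 0 <= head[1] < SIZE)
    if hit_wall or head in body:
        return snake, False, False
    return [head] + body, ate, True


def render(snake, food):
    """Draw the field using the characters listed in the prompt."""
    rows = []
    for r in range(SIZE):
        row = ""
        for c in range(SIZE):
            if (r, c) == snake[0]:
                row += "X"  # head
            elif (r, c) in snake:
                row += "O"  # body
            elif (r, c) == food:
                row += "*"  # food
            else:
                row += "."  # empty
        rows.append(row)
    return "\n".join(rows)


def play():
    """Blocking loop: type w/a/s/d and Enter; empty input keeps the direction."""
    snake, direction, score = [(SIZE // 2, SIZE // 2)], "d", 0
    food = place_food(snake)
    alive = True
    while alive and food is not None:
        print(render(snake, food))
        print(f"Score: {score}")
        key = input("Move (WASD): ").strip().lower()
        if key in MOVES:
            direction = key
        snake, ate, alive = step(snake, direction, food)
        if ate:
            score += 1
            food = place_food(snake)
    print(f"Game Over! Final Score: {score}")
```

Calling `play()` starts an interactive session. Keeping `step` free of any input or printing is a deliberate choice in this sketch: the same function can later be driven by an AI agent instead of the keyboard.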

Step 2: Generating an AI that learns to play and watching it play

Again, a prompt to get the prompt:

Now we need another prompt. This time we want to use pytorch and will be building a simple 2 hidden layers neural network with reinforcement learning. Use large punishment for loosing the game and small rewards for each eaten food. We want to achieve quick learning without too many iterations.

The prompt it generated this time was much more extensive. Here are all of the prompts: https://github.com/andriybuday/snake-ia/blob/main/README.md

I then fed that prompt to both GPT-4o and Claude.
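The reward scheme requested above (a large punishment for losing, a small reward per food) can be sketched in a few lines. The constants below are hypothetical values picked for illustration, not the ones in the generated code, and the small per-step penalty is an assumption that mirrors the "time penalty" tuned later in this post.

```python
# Hypothetical values for illustration; the generated code may use different ones.
DEATH_PENALTY = -10.0  # large punishment for losing the game
FOOD_REWARD = 1.0      # small reward for each food eaten
STEP_PENALTY = -0.01   # assumed tiny time penalty to discourage wandering


def reward(ate_food: bool, died: bool) -> float:
    """Reward for a single move of the snake."""
    if died:
        return DEATH_PENALTY
    if ate_food:
        return FOOD_REWARD
    return STEP_PENALTY


def discounted_return(rewards, gamma=0.9):
    """Total return of an episode: r0 + gamma*r1 + gamma^2*r2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

With numbers like these, one death wipes out the gain from many foods, so the agent is pushed toward survival first and score second.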

Claude generated a much better AI. The GPT version couldn’t even score more than one food, while the Claude version was in the territory of 10-20. Note that the maximum theoretical score on a 10×10 field is 99. You can see above a gif showing the last few epochs of training and gameplay of the Claude version.

The code for this version: https://github.com/andriybuday/snake-ia/blob/main/snake_game_ai_claude.py

Step 3: Improving the AI so it reaches better scores

OK, so what can be done to make this reach better scores? I asked GPT to recommend some improvements. It gave me general recommendations, out of which I created a prompt for a prompt:

Generate prompt I can give to Claude to improve performance of the Snake AI, potentially with these improvements: Change head to Dueling DQN, Add Double DQN target selection, Add PER (proportional, α=0.6, β anneal 0.4→1.0), Add 3-step returns, Add distance-delta shaping + starvation cap.
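To make the jargon in that request a bit less opaque, here is a rough sketch of what each improvement computes, written in plain Python on lists of numbers instead of PyTorch tensors. All function names, the `scale` and `cap` parameters, and the default `gamma` are assumptions for illustration; only α=0.6 and the β range 0.4→1.0 come from the prompt, and none of this is the code Claude generated.

```python
def dueling_q(value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, online_next_q, target_next_q, done, gamma=0.99):
    """Double DQN: the online net picks the next action, the target net scores it."""
    if done:
        return reward
    best = max(range(len(online_next_q)), key=lambda a: online_next_q[a])
    return reward + gamma * target_next_q[best]


def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """n-step return: discounted sum of n rewards plus a discounted bootstrap."""
    total = sum((gamma ** k) * r for k, r in enumerate(rewards))
    return total + (gamma ** len(rewards)) * bootstrap_value


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional PER: P(i) = p_i^alpha / sum_k p_k^alpha, with p_i = |td_i| + eps."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta):
    """PER bias correction: w_i = (N * P(i))^(-beta), scaled so the max is 1."""
    n = len(probs)
    weights = [(n * p) ** (-beta) for p in probs]
    top = max(weights)
    return [w / top for w in weights]


def anneal_beta(step, total_steps, start=0.4, end=1.0):
    """Linearly anneal beta from `start` to `end` over training."""
    frac = min(1.0, step / total_steps)
    return start + frac * (end - start)


def distance_shaping(old_dist, new_dist, scale=0.1):
    """Distance-delta shaping: bonus for moving closer to food, penalty otherwise."""
    return scale * (old_dist - new_dist)


def starved(steps_since_food, cap=100):
    """Starvation cap: end the episode if the snake goes too long without food."""
    return steps_since_food > cap
```

For example, `n_step_return([1, 1, 1], bootstrap_value=8, gamma=0.5)` adds the three discounted rewards (1 + 0.5 + 0.25) to the bootstrap value discounted three times (0.125 × 8), which is what "3-step returns" means in the request.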

To be honest, at this point I don’t know if these improvements make sense or not, but I took the generated prompt and fed it to Claude. What I got was broken code, which crashed with “IndexError: Dimension out of range”. I was hoping to run into something like this. Finally. Now, I could probably debug the problem and try to find where we are running out of range, but no: I shared the error and stack trace with Claude again. It was able to fix the crash, BUT things got worse: the snake would run into infinite loops.

It turns out the generated “upgraded” version was much worse. So I decided to take a different path: go back to the simple first version and see what could be tuned. The only things I did were increasing the training time (number of episodes), allowing more steps per training episode, and slightly decreasing the time penalty. This is the change: https://github.com/andriybuday/snake-ia/commit/796ad35924700dcb73ac6aaecf8df39ec8069940

With the above changes the situation was much better but still not ideal.

Conclusion

Sorry for the abrupt ending, but I don’t really have time to fine-tune the generated NN or create new models to achieve the best results. The purpose here was to play and see what we can get really quickly. Another purpose of this post is to show that people who just do Vibe Coding without knowing the underlying fundamentals, like me in this case, cannot really achieve the best results quickly. Happy Vibe Coding!

Try it yourself:

git clone https://github.com/andriybuday/snake-ia.git

cd snake-ia

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

python snake_game_ai_claude.py

..........
......OOO.
....OOOXOO
....OOOOOO
....O.....
..O.O.....
..OOO.....
..........
......*...
..........
Length: 21  Steps: 161
Game Over! Final Score: 8.9
