Introduction

Previously in my attempts to write a JavaScript game with AI, I was trying out Claude 3.5 Sonnet and I was impressed because it was definitely coding better than ChatGPT 4o. There are rumors that OpenAI will be releasing a new model next year, but the word in the industry is that newer models seem to be incremental improvements rather than seismic ones. I’m happy about that because I want to see more work leveraging what we have (including reducing energy consumption!) rather than the mad rush to build bigger and better nuclear-powered AI .

So yesterday, after a frustrating day of trying to make vector databases play better with Perl, I put that aside a bit and returned to the game. I previously hit my limit with Claude, but now I’m paying for it (along with ChatGPT and Gemini). I decided to make the game more useful (it’s still awful on mobile). Along the way, I took detours into ChatGPT and Gemini.

The game is here, if you want to skip the article.

The image displays a video game screen for "Level 3: Space Station," featuring a dark background with a grid of light squares. In the center, there are two colorful characters—one blue and one red—along with a yellow object at the bottom right, indicating the player's position or goal. — The Space Station Level

This Time, the Game is Fun!

I turned to Claude first. Claude has now become my go to assistant for coding, writing, and brainstorming. Unfortunately, it’s still a bit slow and if you have long sessions, it gets cranky and limits your access for a while. But still, when I can use it, it’s the best.

So, step-by-step, I started adding various features.

Select “Easy,” “Medium,” and “Hard” difficulty levels
Create themes for each level, such as “Forest,” or “Ice Cave”
One teleport per level, but unused teleports carry over
You win if you can reach level 6 (not easy on hard!)
Resize the game to fit your browser.

The game was kind of fun before, but now it’s a cross between a thinking game, and a logic puzzle. You can pause with the space bar and try to figure out if your reflexes are fast enough to sneak past the monsters. The different themes per level really make the game stand out. I’m enjoying it.

A pacman-like character with the eye on the rim of the mouth — The eyes were awful.

It was a pleasant hour or so, adding all of these features and tweaking the game play, but there was one thing I finally couldn’t stand any more. Claude had never been able to get the eyes right. I decided to tackle that.

It was a frustrating fifteen minutes, with eyes all over the place. I finally gave up and turned to ChatGPT. Wash, rinse, repeat. This, apparently, is a hard problem. My goal of writing no code wasn’t working. I was about to adjust the code, when I had a thought. A curious thought. One that rarely turns out well for me.

I turned to Google’s Gemini. I have a paid account there, too. Mainly this is for the one million token context window and the Google Docs integration. Those are Gemini’s saving grace. Nothing else is good. I knew Gemini would fail, but I gave it a shot. I uploaded the image, shared the HTML, and wrote the following prompt:

For the following game, you’ll notice that the eye of the main character is on the edge of the mouth. I would like it positioned more naturally in the head. I’ve tried with other LLMs and they either lose the pacman shape altogether, or they get the eye outside the head entirely.

Have I mentioned that Gemini is terrible? Yes, it’s terrible. So imagine my shock when I applied its suggestion and it worked! Unlike Claude and ChatGPT, it actually moved the eye closer inside the body. I was stunned. This was the first time that Gemini undeniably did something Claude and ChatGPT couldn’t. Fantastic!

But even though it correctly moved the eye, it was still touching the lip, even though it wasn’t overlapping it. So I issued another prompt:

That definitely looks better. Can you move it a bit further inside the body?

And Gemini dutifully replied :

I’m a text-based AI, and that is outside of my capabilities.

Have I mentioned that Gemini is terrible?

Fortunately, since its first attempt sort of worked, I manually adjusted the numbers to get the eyes inside the head. Finally.

Bad Code

Yes, Claude is still writing bad code. Here’s a great example:

if (maze[newY][newX] === 2) {
    currentLevel++;
    isPaused = true;
    if (currentLevel > 5) {
        alert('Congratulations! You won the game!');
        currentLevel = 1;
        teleports = 1;
    }
    initGame();
}

What is 2 above? Why is the maximum level of 6 hard-coded as > 5? It should look closer to something like this:

if (maze[newY][newX] === DOOR) {

    // You reached the exit, so let's
    // go to to the next level
    currentLevel++;
    isPaused = true;
    if (currentLevel >= MAX_LEVEL) {
        alert('Congratulations! You won the game!');
        currentLevel = 1;
        teleports    = 1;
    }
    initGame();
}

Time for another prompt:

In the game, we often see hardcoded values like this:

if (isValid(nx, ny) && maze[ny][nx] === 1) {

And this:

} else if (maze[y][x] === 2) {

Please extract all appropriate hard-coded values and replace them with named constants to make the code easier for a human to understand.

Claude did its thing. It generated tons of constants, documented those that might be confusing, failed to use most of them, and hallucinated a new function, initMonsters, and failed to call the function it hallucinated.

As I explain to managers and developers: AI are talented junior developers who never learn. They’re cheap, they’re fast, they never get tired, they never file HR complaints, and they can code in more languages than you’ve ever heard of. But they don’t code well and they don’t learn (short of fine-tuning your own model or using RAG).

So don’t worry yet, devs. Your jobs are still safe. For a while.

Conclusion

I’m looking forward to the next release of ChatGPT, but for now, Claude is still my first choice. Gemini is terrible. That’s a shame because Google clearly has the skills to do amazing things and I love some of the research papers they’re releasing around AIs, particularly their work in reducing hallucinations, but for now, if you’re not using Google Docs, it’s just not worth it.

Claude is king.

But ... ChatGPT does offer something which is a killer feature. They have a published OpenAPI spec that is regularly updated. Anthropic does not. There’s an unofficial spec and a Claude to OpenAPI API translator , but nothing from Anthropic.

I’ve seen this problem with vector databases, too. You can fire up ChromaDB and get its spec from http://localhost:8000/openapi.json, but I can’t find anything like that for Pinecone . To me, this is mystifying. Having such a spec makes it much easier to work with API-driven products. I have no idea why so many companies are making it harder to use their product. So kudos to OpenAI for this.

If you'd like top-notch consulting or training, email me and let's discuss how I can help you. Read my hire me page to learn more about my background.

Escape!-Adventures in AI Gaming

Introduction

This Time, the Game is Fun!

Bad Code

Conclusion