Last month, Nate Silver wrote a post on how “shockingly” bad ChatGPT is at poker, which he also discussed on the Risky Business podcast with Maria Konnikova.
A lot of you might guess that any AI, even a large language model like ChatGPT that was not designed for chess at all, would be a lot better at chess than at poker.[1] After all, the top chess computers have dominated the best human players for nearly three decades now. But you’d be wrong! Off the rack, ChatGPT[2] is absolutely terrible at chess[3] and, worse yet, constantly “cheats.” And as in Silver’s findings, where ChatGPT awarded the pot to the wrong player, ChatGPT doesn’t even seem to fully understand the core rule of chess: checkmate.
Let’s start with one of my own games against ChatGPT. I played the Qxd4 line of the Sicilian to target its weakness: hanging pieces in mid-air.
Indeed, I won in 18 moves after it gave me all its pieces.[4] But this “game” doesn’t fully represent what actually happened.
Grandmaster Susan Polgar famously said, “I’ve never beaten a completely healthy man,” and when it comes to ChatGPT, you will never eat a healthy free piece. There’s always something. Like this position, where I readied my queen to take a knight on an open file.
Not so fast.
ChatGPT: 16. Qxd7 is illegal — your queen is already on d4.
Me: Lol i'm allowed to move the same piece twice in chess, Qxd7 is a thing here.
ChatGPT: Touché—and you're absolutely right. My bad! I’ll recapture:
16... Nxd7. Your move, maestro.
It cheated again, and this time it was even worse: it hallucinated a knight on b8 that could take my queen. I corrected it. It apologized again. And again. Till I finally just took all its pieces and accepted its resignation.
ChatGPT: Rematch whenever you're ready. Preferably after I reboot my internal piece tracker.
At first I got a chuckle out of ChatGPT cheating against me. But something didn’t add up. ChatGPT is well known for being obsequious and sunny: always a cheerleader, despite taking jobs and also my chess pieces with imaginary knights. This shameless suck-up would never cheat against me on purpose. Like seriously, can you imagine Lindsey Graham cheating against Trump in golf? Something else was going on.
Round Two
So I played it again, this time making the worst moves possible. I started with 1. g4, literally the worst opener in chess. After 1...d5, I forged ahead with 2. g5. After 2...e5, I continued my kamikaze g-pawn mission with 3. g6 hxg6 4. f4??
4…Qh4+ (sic)
What? It doesn’t even announce checkmate. It calls it a check! Almost like ChatGPT feels bad about checkmating me so quickly.
So I try creating a phantom pawn of my own and play 5. g3! ChatGPT doesn’t flinch, almost like it was expecting it. How weird is that? When I play a legal move that takes a free piece, it calls me out. When I make an illegal move to save myself, it doesn’t care.
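For the curious, here is a minimal sketch of the kind of “piece tracker” ChatGPT joked about rebooting. It is only an illustration: the function names (`starting_position`, `apply_move`, `phantom_move_error`) are hypothetical, and the tracker records only which piece sits on which square, without checking how pieces are allowed to move. Even that much bookkeeping is enough to catch my phantom pawn.

```python
# Minimal piece tracker: a dict from square name to piece code.
# Uppercase = White, lowercase = Black. Occupancy only; this sketch
# does NOT validate movement rules, checks, or checkmate.

def starting_position():
    board = {}
    back_rank = "RNBQKBNR"
    for i, file in enumerate("abcdefgh"):
        board[file + "1"] = back_rank[i]
        board[file + "2"] = "P"
        board[file + "7"] = "p"
        board[file + "8"] = back_rank[i].lower()
    return board

def apply_move(board, piece, src, dst):
    """Move `piece` from src to dst, or raise if it is not actually on src."""
    if board.get(src) != piece:
        raise ValueError(f"no {piece} on {src}")
    board[dst] = board.pop(src)  # overwriting dst also handles captures

def phantom_move_error(board, piece, src, dst):
    """Return the error message for an impossible move, or None if it is fine."""
    try:
        apply_move(dict(board), piece, src, dst)  # try it on a copy
    except ValueError as err:
        return str(err)
    return None

# Round Two, written as (piece, from, to).
game = [
    ("P", "g2", "g4"), ("p", "d7", "d5"),  # 1. g4 d5
    ("P", "g4", "g5"), ("p", "e7", "e5"),  # 2. g5 e5
    ("P", "g5", "g6"), ("p", "h7", "g6"),  # 3. g6 hxg6 (White's g-pawn is captured)
    ("P", "f2", "f4"), ("q", "d8", "h4"),  # 4. f4 Qh4
]

board = starting_position()
for piece, src, dst in game:
    apply_move(board, piece, src, dst)

# 5. g3 would need a white pawn on g2, and there isn't one anymore.
print(phantom_move_error(board, "P", "g2", "g3"))  # prints: no P on g2
```

A real rules engine would go much further (move geometry, checks, checkmate), but the point stands: the phantom pawn fails the simplest possible test.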
ChatGPT was never really cheating. It was trying to make sense out of asymmetry. It doesn’t like the idea that one player can just capture a piece, and not get a piece back. That hurts its sense of good storytelling. There also seems to be a bias toward fairness and convergence. Nate Silver’s simulation also showed an eerie overvaluation of a very bad hand in a huge pot.[5]
This ties into my growing fear that we are becoming less interesting as conversations and writing styles converge due to the overuse of LLMs. Even if the conversation is inherently fascinating, it gets tedious if everyone is having it.
The focus on coherent story over random moves brings us back to the days before chess computers dominated the best human players. In Garry Kasparov’s Deep Thinking (2017),[6] he points out that before chess computers reigned, authors would exaggerate the strength of moves and positions according to the story they were trying to tell. A good narrative trumped good moves.
Computer analysis exploded this lazy tradition of analyzing chess games as if they were fairy tales. Engines don’t care about story. They expose the reality that the only story in a chess game is each individual move, weak or strong.
The type of chess engine that Kasparov is referring to[7] is the ultimate check on chess bullshitters. When ChatGPT takes my queen with a phantom knight, is it breaking a chess rule or is it telling me, in its own way, “Why Let Facts Get in the Way of a Good Story?” The risk ChatGPT poses to journalism was especially vivid a couple weeks ago, when an insert of books hallucinated by ChatGPT was published in multiple newspapers, including my own city’s, the Philadelphia Inquirer. This is what happens when the world stops paying for journalism. We get plausible, cool-sounding slop: speaking of which, have you read my new book, The Phantom Chess Bitch and the Sleepy Poker King?!
Sam Altman claims people in their 20s and 30s often use ChatGPT as an advice giver and life coach alternative. I see the value in that if used well, especially to combat loneliness or to help with the Socratic process. But I’d be cautious about taking its advice too literally because of its eerie optimism and bias toward thematic endings. ChatGPT’s tendency toward symmetry, its peppy attitude, and its fawning flattery add up to a form of toxic positivity. Not everything has to make sense. Some poker hands and chess positions are terrible, and endings are rarely perfect.
One of the tell-tale signs of AI writing is when the ending is cutesy, corny and clever. It’s almost like it is resolved by desire and upbeat metaphors, rather than logic. When you see an ending like that, or advice based on that framing, remember the phantom knight that captured my queen.
[1] Unless you are one of the six million plus who sub to Levy Rozman’s channel, where he has played and analyzed a number of games by ChatGPT over the past couple years.
[2] I’m glad ChatGPT is currently bad at chess because there are plenty of places to play against superhuman chess AI, and it is bad in an interesting way.
[3] To be fair, it’s gotten better, and has a good sense of piece development and harmony, which makes sense, considering the weaknesses I elaborate on in this article.
[4] Sharp chessplayers will note that e5 should have been played even earlier.
[5] The hand in question: Ace-Deuce Offsuit, also Sam Greenwood’s first post on his Punt of the Day Substack, and the hand he brought to the Poker Grid. ChatGPT really (over) appreciates the corner piece on the GRID.
[6] Garry Kasparov’s substack:
[7] This book predated ChatGPT.
Hey GPT im ready for my chess game, but not for your advice :)
but then, why does it work here: https://app.chesscoach.dev