Chess in that almost alien way
On a long bus ride with some friends recently, I found one was playing chess on his iPad. That is, he was playing against the little machine—or if you like, against the chess app on the little machine. I leaned over and asked: “Who’s winning?”
To my surprise, he said with a sheepish smile: “I am!” Then he explained. Every now and then, he was getting the program to suggest moves for him. Or he was taking back moves that he had made and reconsidered. Or he was asking the program to take back moves that he found particularly threatening. No wonder he was winning.
Still, you will appreciate that mine was a rhetorical question. I’ve played occasionally against these chess programs, and it isn’t easy at all. Compared to me, they play at warp speed and nearly every move asks tough questions of my limited chess abilities. Sometimes it feels like I’m battling an onrushing tide of chess prowess, coming at me relentlessly. So when I saw my friend playing, I expected that his iPad would win the game.
This has been a reasonable expectation for a couple of decades now, even if you match better players than my friend against better chess programs than the iPad’s. It was in 1997 that a chess computer, Deep Blue, first beat a world chess champion—Garry Kasparov—in a match. Since then, chess computers have only improved.
Ever since humans invented the computer, getting one to play chess well has been a sort of holy grail of computing. Here was a game whose rules were relatively simple to teach a computer. At the same time, those rules and the board the game is played on can produce an astronomically large number of possible games, offering a level of complexity challenging enough to fuel research into artificial intelligence (AI). (To put that in perspective, consider what kind of triumph it would be to produce a program that wins at noughts-and-crosses). Naturally, the earliest chess computers were pushovers even for strictly amateur players—I remember how thrilled I was to win a game against one in the late 1980s. But they got better fast. If you look at so-called Elo ratings, which tell you how strong a given player is at the game, computers have scored higher than the best humans for years—higher, in fact, than 2,882, the highest Elo rating any human has ever achieved. (Deep Blue itself was never rated, because IBM retired it before it played enough games against players who themselves had Elo ratings).
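The Elo system itself rests on one simple formula: the rating gap between two players converts into an expected score. A small sketch—the 3,400 engine rating below is purely illustrative, since human and engine rating pools aren't directly comparable:

```python
# Standard Elo expected-score formula: every 400 points of rating gap
# multiplies the stronger player's expected odds by about 10.
def expected_score(rating_a, rating_b):
    """Expected score (between 0 and 1) of player A against player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

print(round(expected_score(1500, 1500), 3))  # evenly matched: 0.5
print(round(expected_score(2882, 3400), 3))  # the best human against a top engine
```

Against a hypothetical 3,400-rated engine, even a 2,882-rated human would be expected to score under 5%—a shade over one point in a 20-game match.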
So how do these programs work? In effect, by looking several moves ahead and evaluating the positions that result. That's what humans do too. The further ahead you can look on any given move, the more positions you can compare, and the more likely you are to choose well and win the game. So when I'm thinking about my next move, I'm considering what my opponent might do in response, and how I might respond to that. That's just two moves into the future, and I am rarely able to go further—which probably explains why I'm no more than your average chess dabbler. Yet this short-sighted view is hardly surprising: there are usually plenty of moves available, plenty of responses to them, and plenty of responses to those. That is, the number of possible positions rises exponentially with each additional move, and it's humanly impossible to evaluate them all. Even the best players can only look a few moves into the future. Instead, they prune the possibilities, relying on intuition and expertise to decide which moves to consider and which to ignore.
But a computer? In theory, it could look at every possible move from a given position, any number of steps into the future. All it needs is the time to do that. But a chess computer that takes several hours to find the best move is, after all, hardly a chess computer. So to play chess reasonably, computers needed to get fast enough not just to apply rules, look ahead and decide on the best move in a given position, but to do all that in a reasonable time. Deep Blue, the world-beater of 1997, had more going for it, but that essentially describes how it worked.
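That look-ahead idea can be sketched in a few lines. The toy "minimax" search below has nothing to do with chess—positions here are just integers, the moves and the evaluation function are invented for illustration—but the recursion has the same shape a classical engine's search does: I pick the move that is best for me, assuming you will then pick the move that is worst for me, and so on down to a fixed depth.

```python
# A toy, depth-limited minimax search: the bare idea behind classical
# chess engines, stripped of chess. "Positions" are integers, each move
# adds or subtracts 1 or 2, and higher positions are "better".

def moves(position):
    # all positions reachable in one move (invented for this toy)
    return [position + d for d in (-2, -1, 1, 2)]

def evaluate(position):
    # stand-in for a chess engine's evaluation function
    return position

def minimax(position, depth, maximizing):
    if depth == 0:
        return evaluate(position)
    results = [minimax(p, depth - 1, not maximizing) for p in moves(position)]
    return max(results) if maximizing else min(results)

# Looking one move ahead, I can reach +2; looking two moves ahead,
# the opponent's best reply drags every line back down.
print(minimax(0, 1, True))  # → 2
print(minimax(0, 2, True))  # → 0
```

The catch the paragraph above describes is visible in the code: each extra level of `depth` multiplies the work by the number of legal moves, so a real engine's achievement is doing this fast enough, deep enough, in tournament time.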
And yet, give this some thought. If a chess computer is merely applying rules and evaluating positions, albeit at phenomenal speeds, can we really say it is intelligent? Microsoft Excel can add up the numbers in a column much faster than you or I can. Can we really say Excel is intelligent? This is why AI researchers have never quite been persuaded of the worth of getting computers to play chess in this way. No doubt Deep Blue represented a triumph of programming and exploiting the power of computers, but calling that AI is a stretch. Why are we humans intelligent, after all?
Get this: because we can learn how to play chess. Learning, most of us would agree, is one of those fundamentals of intelligence. So can we produce a computer that can learn chess?
Well, just maybe we can. In early December, chess and AI circles began buzzing with news of the feats of AlphaZero, a chess-playing program developed by Google’s AI wing, DeepMind. And of AlphaZero, you could certainly say it had learnt the game.
More correctly, it learnt how to play the game well. For to begin with, it was given the bare-bones rules of chess—the way knights move, the way castling happens, etc.—and no more. In that limited sense, it knew how to play chess. But it had no notion of strategies that human players use, nor of the relative values of pieces that invariably figure in human assessments of a given situation in the game. Just the rules, and then AlphaZero set to work playing itself again and again. Not just once or twice, but some 40 million times. In the process, it taught itself how to play the game, better and better all the time. This is a process AI researchers call “reinforcement learning”.
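On a vastly smaller scale, that self-play loop can be sketched. Everything below is my own toy construction, not anything from DeepMind: the game is a heap of 21 stones, each player removes 1 to 3, and whoever takes the last stone wins. The program starts knowing only those rules, plays itself over and over, and after each game nudges its learned value for every move it made towards the outcome that move led to—the spirit, if not remotely the scale, of reinforcement learning by self-play.

```python
import random

random.seed(0)
Q = {}  # (stones_left, stones_taken) -> learned value for the player to move

def legal(stones):
    return [t for t in (1, 2, 3) if t <= stones]

def choose(stones, eps):
    # with probability eps explore a random move, else play the best learned one
    if random.random() < eps:
        return random.choice(legal(stones))
    return max(legal(stones), key=lambda t: Q.get((stones, t), 0.0))

def selfplay(episodes=20000, eps=0.2, lr=0.5):
    for _ in range(episodes):
        stones, history = 21, []
        while stones > 0:
            take = choose(stones, eps)
            history.append((stones, take))
            stones -= take
        # whoever moved last took the final stone and won; walk back
        # through the game, flipping the reward's sign at each ply
        reward = 1.0
        for stones, take in reversed(history):
            old = Q.get((stones, take), 0.0)
            Q[(stones, take)] = old + lr * (reward - old)
            reward = -reward

selfplay()
# With 3 or fewer stones left, the program has learnt to grab them all:
print([choose(h, 0.0) for h in (1, 2, 3)])  # → [1, 2, 3]
```

Nobody told the program that taking the last stones wins the endgame; like AlphaZero with its 40-odd million games, it worked that out purely from the results of playing itself—just in twenty thousand games of a far simpler game.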
The results were stunning. After just four hours of training itself this way, AlphaZero was sent into battle against a program called Stockfish. Now, Stockfish is the 2016 winner of the Top Chess Engine Championship, effectively the world championship of computer chess. Take Kasparov, Viswanathan Anand, Bobby Fischer, Tigran Petrosian, Magnus Carlsen, any of the great champions of chess through history—Stockfish would handily beat every one of them. That's how strong it is.
Stockfish and AlphaZero played 100 games. AlphaZero won 28 of those, 72 were drawn and Stockfish won precisely zero. Stunning. Interestingly, 25 of those 28 wins came with AlphaZero playing white, which underlines what chess experts have always believed: making the first move—which white does—confers an advantage.
What’s more, AlphaZero played in a distinctly different style from either Stockfish or humans. As Demis Hassabis, CEO of DeepMind, remarked: “It doesn’t play like a human, and it doesn’t play like a program. It plays in a third, almost alien, way.” It played what one article described as an “all-out attacking style”, often apparently throwing out the window any caution a human might observe, sometimes making head-scratching but spectacular sacrifices of valuable pieces to gain a positional advantage. Strange, and yet the results spoke for themselves. If AlphaZero’s unorthodox style could win so overwhelmingly, perhaps there are lessons in it about how we have always played the game. Perhaps even lessons about how we learn. About intelligence itself.
It’s worth pointing out here that AlphaZero should not really be considered a chess program. It’s more like a learning machine. Given a basic set of rules, it can teach itself to play any other game too. In fact, a slightly different version of the program learnt to play Go, the ancient board game that’s far more complicated than chess. Then it beat the program that had beaten the human Go world champion—not once or twice, but one hundred times in a row.
More generally, AlphaZero’s creators believe they can make their learning machine learn how to tackle tasks that have nothing to do with chess or other games. Finance, drug-testing and more—in other words, wherever it is possible to spell out some basic rules. Whether this will really pan out, we’ll have to wait and see. But meanwhile, AlphaZero has shown that computers can learn. Maybe not in the way that we humans do, but learn nevertheless.
And that has some profound implications for AI. Maybe for intelligence itself.
Once a computer scientist, Dilip D’Souza now lives in Mumbai and writes for his dinners.
His Twitter handle is @DeathEndsFun