博弈论GAME THEORY

井字棋:一个被人类彻底解决的游戏Tic-Tac-Toe: A Game Humanity Completely Solved

井字棋大概是每个人小时候都玩过的游戏:3×3 的格子,画圈打叉,三连成线。它简单到几乎不值一提——可正是这份简单,让它成为博弈论的完美入门范例,也是计算机科学课堂上最经典的算法练习。

一个"被解决"的游戏

在博弈论里,一个游戏如果可以从数学上算出在双方都完美发挥时的必然结果,就被称为"已解决(solved)"的游戏。井字棋就是被彻底解决的:只要双方都不犯错,结局永远是平局。

这意味着,在"困难"难度下,你最好的结果就是逼平电脑——你不可能赢。这不是程序耍赖,而是这个游戏的数学本质。

到底有多少种棋局?

井字棋的状态空间小到可以完全穷举。如果不考虑对称性,按落子顺序计算,总共有 255,168 种不同的完整对局。其中先手(X)获胜的有约 13 万局,后手(O)获胜的约 7.7 万局,平局约 4.6 万局。考虑到棋盘的旋转与镜像对称后,本质上不同的局面只有几百种——小到一个人用纸笔就能分析完。

minimax:让电脑永不犯错

驱动"困难"难度电脑的,是一个叫 minimax(极小化极大)的经典算法。它的思路非常符合直觉:

  • 电脑假设自己会选择对自己最有利(得分最大)的走法;
  • 同时假设你会选择对它最不利(得分最小)的走法;
  • 它递归地把游戏一直推演到结束——赢记 +10,输记 −10,平记 0;
  • 然后回溯,在每一步都选择"假设对手最优应对下,自己仍最好"的那一步。

因为井字棋的局面如此之少,minimax 可以瞬间把整棵博弈树算到底,因此它永远不会失误。这也是为什么它"不可战胜"。

minimax 的美妙之处在于:它不靠经验或直觉,而是把"假设对手是完美的"这一悲观假设,变成了永不犯错的策略。

从井字棋到 AlphaGo

同样的核心思想——推演未来、假设对手最优、选择对自己最有利的一步——一路延伸到了国际象棋和围棋。区别只在于:象棋和围棋的局面多到天文数字,无法穷举,于是人们引入了"剪枝"(α-β 剪枝)、启发式评估,最终发展出深度学习与蒙特卡洛树搜索,才有了击败人类冠军的 AlphaGo。

所以,当你在和我们的"困难"电脑较量时,你其实是在与现代博弈 AI 的最朴素祖先过招。试试看你能不能每一局都逼平它——能做到,你就已经摸到博弈论的门道了。

Tic-Tac-Toe is the game almost everyone played as a child: a 3×3 grid, X's and O's, three in a row. It is so simple it seems hardly worth mentioning — yet that very simplicity makes it a perfect first example of game theory and one of the most classic algorithm exercises in computer science.

A "solved" game

In game theory, a game is "solved" if its outcome under perfect play by both sides can be computed mathematically. Tic-Tac-Toe is completely solved: if neither player blunders, the result is always a draw.

That means on "hard" difficulty, the best you can do is force a draw — you cannot win. The computer isn't cheating; this is the mathematical nature of the game.

How many games are there?

The state space is small enough to enumerate entirely. Ignoring symmetry and counting by move order, there are exactly 255,168 distinct complete games. Of these, the first player (X) wins about 131,000, the second player (O) wins about 77,000, and roughly 46,000 are draws. Once you fold in the board's rotations and mirror reflections, only a few hundred essentially different positions remain — few enough to analyse by hand.

Minimax: making the computer flawless

The "hard" computer is driven by a classic algorithm called minimax. Its logic is intuitive:

  • The computer assumes it will pick the move best for itself (maximum score);
  • and that you will pick the move worst for it (minimum score);
  • it recurses all the way to the end of the game — a win scores +10, a loss −10, a draw 0;
  • then it backs up, choosing at each step the move that is best for itself assuming you respond optimally.

Because Tic-Tac-Toe has so few positions, minimax can compute the entire game tree to the very end instantly, so it never errs. That is why it is unbeatable.

The beauty of minimax: it relies on no experience or intuition. It turns one pessimistic assumption — that the opponent is perfect — into a strategy that never makes a mistake.

From Tic-Tac-Toe to AlphaGo

The same core idea — look ahead, assume the opponent plays optimally, pick the move best for you — reaches all the way to chess and Go. The only difference is scale: chess and Go have astronomically many positions that cannot be enumerated, so people added "pruning" (alpha-beta), heuristic evaluation, and eventually deep learning with Monte Carlo tree search — the lineage that produced the human-champion-beating AlphaGo.

So when you spar with our "hard" computer, you are facing the humblest ancestor of modern game AI. See if you can force a draw every single time — if you can, you have already grasped the heart of game theory.

想亲手试试?Want to try it yourself?

挑战不可战胜的 minimax 电脑,看你能否逼平。Take on the unbeatable minimax computer — can you force a draw?

立即开玩Play now
昼夜工坊
昼夜工坊编辑部 · Day & Night Studio
我们自己设计游戏,也自己写关于游戏的一切。We make our own games — and write everything about them ourselves.