In January 2022, a website called Wordle went from a few thousand daily players to two million in under two weeks — with no advertising, no app store, no social-media campaign. People shared green and yellow squares on Twitter. News outlets ran "what is this?" explainers. The New York Times paid a seven-figure sum to acquire it. Wordle ignited a new kind of internet phenomenon: one puzzle per day, the same word for everyone, shared without spoiling.
But the underlying mechanic, narrowing down a hidden word through colour-coded feedback, is not new. The feedback rules closely mirror the 1970s board game Mastermind and, before that, pencil-and-paper games such as Bulls and Cows and its word-based cousin Jotto. Why did this mechanism go globally viral half a century later? The answer sits at the intersection of linguistics, psychology, and information theory.
The mathematical structure of English vocabulary
One of the delights of word-guessing games is that they turn linguistic intuition into a mathematical problem. In a corpus of ~12,000 five-letter English words, letter frequencies vary enormously. E appears in about 11% of letter positions, A in 8.5%, R in 7.5%; Q, X, and Z appear in under 1% each. This distribution is not random — it reflects the entire evolutionary history of the English lexicon.
That distribution makes "the best opening word" a question open to rigorous analysis. A good opener should cover high-frequency letters and make each letter contribute independent information (no repeats). Published information-theoretic analyses consistently rank words such as SOARE, CRANE, RAISE, and STARE among the strongest first guesses, each eliminating well over 70% of the candidate pool on average. Which word comes out on top depends on the word list used and on whether the goal is maximum expected information or the fewest guesses on average.
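As a rough illustration of the "high-frequency letters, no repeats" heuristic, here is a minimal Python sketch. The eight-word list is a toy stand-in chosen for this example, and `letter_frequencies` and `coverage_score` are hypothetical helper names, not part of any published solver. A real analysis would load a full five-letter dictionary instead.

```python
from collections import Counter

# Toy word list for illustration only; not the real Wordle dictionary.
WORDS = ["crane", "raise", "stare", "arose", "eerie", "fuzzy", "queen", "mamma"]

def letter_frequencies(words):
    """Share of all letter positions occupied by each letter."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}

def coverage_score(word, freq):
    """Sum the frequencies of the distinct letters in `word`.

    Iterating over set(word) counts a repeated letter only once,
    so words with duplicate letters score lower.
    """
    return sum(freq.get(letter, 0.0) for letter in set(word))

freq = letter_frequencies(WORDS)
ranked = sorted(WORDS, key=lambda w: coverage_score(w, freq), reverse=True)
print(ranked[:3])
```

On this toy list RAISE ranks first, while EERIE and MAMMA are penalised for their repeated letters. The score is only a heuristic: it ignores letter positions and the feedback a guess would actually produce.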
Information theory: how many bits is each guess worth?
From an information-theoretic perspective, word-guessing is an optimal decision-tree problem. Each guess's feedback (green/yellow/grey) partitions the candidate pool; your next move should maximise expected information gain — splitting the pool as uniformly as possible across outcomes, rather than betting everything on one high-probability branch.
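The idea of expected information gain can be sketched in a few lines of Python. This is a minimal sketch, assuming every candidate is equally likely to be the answer. The function names `feedback` and `expected_information` and the eight-word candidate pool are assumptions made for this example.

```python
from collections import Counter
from math import log2

def feedback(guess, answer):
    """Return Wordle-style feedback: G (green), Y (yellow), - (grey)."""
    result = ["-"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count answer letters not matched exactly.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1
    # Second pass: a non-green letter is yellow only while unmatched copies remain.
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

def expected_information(guess, candidates):
    """Shannon entropy (in bits) of the feedback patterns, uniform prior assumed."""
    buckets = Counter(feedback(guess, answer) for answer in candidates)
    n = len(candidates)
    return sum((size / n) * log2(n / size) for size in buckets.values())

# Toy candidate pool for illustration only.
candidates = ["crane", "crate", "grate", "irate", "slate", "plate", "skate", "state"]
for guess in ("slate", "crane"):
    print(guess, round(expected_information(guess, candidates), 3))
```

A guess that splits the pool into many small, similar-sized buckets scores high. A guess that leaves most candidates in one bucket scores close to zero.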
Concretely: given N candidate words, the ideal goal is to maximise the average number of candidates eliminated per guess. Identifying one word out of N takes log₂(N) bits in total (≈ 11.2 bits when N = 2315), while a single guess can return at most 3⁵ = 243 distinct feedback patterns, which caps it at log₂(243) ≈ 7.9 bits. Top players average about 3.5 guesses per solution, roughly 3.2 bits of information per guess: impressively efficient, but still well below that ceiling.
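The arithmetic behind these figures is easy to check. The sketch below uses the numbers quoted in the text (2315 candidates, 3.5 guesses on average) and adds one assumption of its own: five tiles with three colours each give at most 3⁵ = 243 feedback patterns, which bounds what any single guess can reveal.

```python
from math import log2

N_ANSWERS = 2315    # candidate count quoted in the text
PATTERNS = 3 ** 5   # each of the 5 tiles is green, yellow, or grey
AVG_GUESSES = 3.5   # average quoted in the text for top players

total_bits = log2(N_ANSWERS)       # uncertainty to resolve over the whole game
per_guess_cap = log2(PATTERNS)     # upper bound on what one guess can reveal
avg_bits_per_guess = total_bits / AVG_GUESSES

print(f"total: {total_bits:.2f} bits")
print(f"cap per guess: {per_guess_cap:.2f} bits")
print(f"average per guess: {avg_bits_per_guess:.2f} bits")
```

Under these assumptions, the roughly 3.2 bits per guess is an average over the whole game. Early guesses typically reveal more than late ones.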
This line of analysis has itself become a fascinating mathematical exercise. Dozens of bloggers, computer scientists, and researchers have published optimal-strategy analyses of Wordle, treating it as an algorithm design problem. Wordle unexpectedly became a vivid teaching example for information theory.
Why "one puzzle per day" is so addictive
Wordle's smartest design decision was not its rules but its scarcity: one puzzle per day, the same word for everyone, and you must wait until tomorrow to play again. This stacks several psychological effects:
- Shared experience: when you and your colleagues are solving the same puzzle, the result becomes social currency: shareable, comparable, discussion-worthy. Because the shared grid shows only coloured squares and no letters, posting it does not spoil the answer for anyone who has not played yet. This is why the green-yellow grid screenshots spread virally.
- Variable reward: difficulty varies from day to day. Sometimes you crack it in two guesses; sometimes you scrape through on the sixth. This variability and the occasional "small win" trigger dopamine reward patterns similar to those studied in gambling research.
- Completion bias: the brain has a strong drive to finish near-complete tasks. A daily puzzle creates a lightweight "to-do item" that gives every morning a small concrete goal.
Word games and linguistic intuition
Regular players of word-guessing games gradually become more sensitive to the distribution of letters in English words, often without being aware of it. When you notice "this word probably doesn't have two U's" or "five-letter words ending in -TION are rare," you are drawing on internalised frequency knowledge.
This "passive language-learning" effect has attracted attention in language education. Research suggests that English learners who play word-guessing games show measurable gains in sensitivity to spelling patterns, the written counterpart of phonotactics: which letter combinations can appear in which positions of a word. This is knowledge that native speakers possess intuitively and that non-native learners typically absorb slowly through extensive reading.
How to improve at word-guessing games
Here are a few strategies that can be backed up by analysis:
- Open with a high-frequency-letter word. CRANE, RAISE, STARE, and AROSE are computationally verified to eliminate large fractions of the candidate pool in the first guess.
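- Rule out grey letters immediately. A grey tile means the letter is not in the word (or, when the guess repeats a letter, that the word contains no further copies of it), so do not spend later guesses on it. This is the rule beginners overlook most often.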
- Relocate yellow letters. Yellow confirms the letter exists but flags the wrong position — move it somewhere untried next time.
- Don't waste guesses on exploration. Once only two or three candidates remain, guess directly from them rather than introducing new letters purely for elimination.
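The colour rules in the list above all amount to one operation: keep only the words that would have produced the feedback you actually saw. Here is a minimal Python sketch of that filter. The names `feedback` and `filter_candidates` and the six-word pool are assumptions made for this example.

```python
from collections import Counter

def feedback(guess, answer):
    """Return Wordle-style feedback: G (green), Y (yellow), - (grey)."""
    result = ["-"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count answer letters not matched exactly.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1
    # Second pass: a non-green letter is yellow only while unmatched copies remain.
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

def filter_candidates(candidates, guess, observed):
    """Keep only the words consistent with the observed feedback for `guess`."""
    return [word for word in candidates if feedback(guess, word) == observed]

# Toy pool for illustration only.
pool = ["trace", "brace", "grace", "crate", "caret", "react"]
remaining = filter_candidates(pool, "crane", "YGG-G")
print(remaining)
```

On this toy pool the call leaves TRACE, BRACE, and GRACE. The yellow C rules out CRATE, where C would have been green, and the green R, A, and E rule out CARET and REACT.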
Ultimately, the pleasure of word-guessing games lies not just in finding the answer but in the process of narrowing possibilities from known constraints — logic and linguistic intuition meeting in a five-letter space. Five minutes a day, and your mind is sharper for the rest of it.
Want to try it yourself?
Put your linguistic intuition and logical reasoning to the test: guess today's word!