
Deep Analysis · 5 min read · 1h ago

The Man $2.7B Couldn't Keep: Why Noam Shazeer Left Google for OpenAI

The Transformer co-author just joined OpenAI as head of architecture research. This isn't a hire. It's a declaration of war.

Noam Shazeer posted six words on X.

"I'm leaving Google and joining OpenAI."

No thanks to his former employer. No sentimental farewell. No mention of Character.AI, Gemini, or the team he led.

Just gone.

Below the post, Sam Altman replied instantly: "Noam has been at the top of my list of people I've wanted to work with since OpenAI's founding. Worth the 10-year wait."

Altman wasn't being polite.

By the time OpenAI launched in 2015, Shazeer had long been one of Google's earliest AI-focused engineers. He joined Google in 2000, with an employee number in the low hundreds. His mentor? Jeff Dean.

In 2017, he co-authored a paper with seven others: Attention Is All You Need.

That paper defined the Transformer. And the Transformer defined the entire AI industry. GPT, Claude, Gemini — every major model traces its lineage back to it.

One of the "Transformer Eight." That title alone gets you a blank check at any AI lab.

But Shazeer's story is far more complicated than "Transformer father defects."

The Man Google Couldn't Hold

This is Shazeer's third departure from Google.

Joined in 2000, left in 2009. Returned to Google Brain in 2012, left again in 2021. Came back in 2024, now gone in 2026.

He joked on a podcast: "I seem to re-join Google every 12 years."

Behind the joke is a recurring pattern: Shazeer saw the future inside Google. Google just wouldn't let him build it.

The 2021 exit was triggered by a chatbot called Meena. Shazeer and colleague Daniel De Freitas built it — a conversational AI that could chat naturally about almost anything. Shazeer wrote an internal memo titled Meena Eats the World, predicting the chatbot could replace Google Search and generate trillions in revenue.

Google didn't launch it. Executives cited safety and fairness risks.

For Google, that was caution. For Shazeer, it was a massive opportunity shelved — and shelved opportunities, in AI, are usually missed ones.

So he left. He and De Freitas founded Character.AI.

One year later, ChatGPT proved Shazeer right. The world realized that chatbots were the default interface to AI.

Character.AI took off. In March 2023, it raised $150M at a $1B valuation.

But startups are hard. The burn rate was high. The revenue model was fuzzy. Users flooded in for romantic role-play, which was not what the founders had envisioned.

Then, in 2024, Google made its move.

A roughly $2.7 billion deal: Google got a license to Character.AI's technology and brought Shazeer, De Freitas, and part of the team back to Google DeepMind. Shazeer owned 30-40% of Character.AI, netting him an estimated $750M to $1B personally.
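
As a rough check on those figures, the sketch below multiplies the reported deal value by the reported ownership range. It assumes, purely for illustration, that proceeds scale linearly with ownership and that the full $2.7 billion counts as distributable value. The article states neither, and the function name is hypothetical.

```python
# Back-of-the-envelope check of the payout range quoted above.
# Assumption (not from the article): proceeds are simply stake x deal value,
# with no taxes, fees, liquidation preferences, or retained licensing value.

DEAL_VALUE_USD = 2_700_000_000   # "roughly $2.7 billion"
STAKE_PERCENT_RANGE = (30, 40)   # "30-40%" ownership


def gross_share_usd(deal_value_usd: int, stake_percent: int) -> int:
    """Return the naive gross share of the deal for a given ownership percentage."""
    return deal_value_usd * stake_percent // 100


low_usd, high_usd = (gross_share_usd(DEAL_VALUE_USD, p) for p in STAKE_PERCENT_RANGE)
print(f"naive gross share: ${low_usd / 1e6:,.0f}M to ${high_usd / 1e6:,.0f}M")
```

Under these assumptions the naive range comes out at $810M to $1.08B, somewhat above the quoted $750M to $1B estimate. The article does not say what accounts for the difference.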

Google spent $2.7 billion to win back the one who got away.

Inside Google, morale soared. One employee compared it to "witnessing the resurrection of Jesus."

Then what?

Less than two years later, Shazeer left again.

This time, he went to the company that built ChatGPT.

What OpenAI Is Really After

This isn't an ordinary hire.

Shazeer's new title at OpenAI: Head of Architecture Research.

Note the word "architecture." Not "make the Transformer bigger." Not "continue scaling." The job is to find what comes after the Transformer.

Over the past two years, the industry has come to a sobering realization: the marginal returns on scaling pre-training are diminishing. Ilya Sutskever said it publicly — pre-training, the single most important scaling recipe of the past decade, is approaching its limits. Making a model 100x larger won't automatically produce another GPT-3-to-GPT-4 leap.

The Transformer itself is showing cracks.

Google DeepMind published a paper this year called The Topological Trouble With Transformers, arguing that pure feed-forward Transformers have a structural weakness in dynamic state tracking. Models are great at "looking back" at context, but bad at maintaining an evolving internal state.

In plain English: the Transformer is like a very thick notebook. Every time the model needs context, it has to flip back through the pages. It doesn't truly remember.
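
The notebook analogy can be made concrete with a toy sketch. It is not a Transformer and does not reproduce the paper's argument; it only contrasts two memory patterns on a simple state-tracking task (running parity of a bit stream). The class and function names are hypothetical, chosen for illustration.

```python
from typing import List, Tuple


class NotebookModel:
    """Toy stand-in for the 'thick notebook': stores every token, re-reads all of them."""

    def __init__(self) -> None:
        self.cache: List[int] = []  # grows by one entry per token seen

    def step(self, bit: int) -> int:
        self.cache.append(bit)
        # Answer the query by scanning the entire stored history again.
        return sum(self.cache) % 2


class StatefulModel:
    """Toy stand-in for a recurrent design: one fixed-size state, updated in place."""

    def __init__(self) -> None:
        self.state = 0  # memory footprint stays constant

    def step(self, bit: int) -> int:
        self.state ^= bit  # fold the new token into the evolving state
        return self.state


def run(bits: List[int]) -> Tuple[List[Tuple[int, int]], int]:
    """Feed the same stream to both models; return paired outputs and the cache size."""
    notebook, stateful = NotebookModel(), StatefulModel()
    outputs = [(notebook.step(b), stateful.step(b)) for b in bits]
    return outputs, len(notebook.cache)


outputs, cache_size = run([1, 0, 1, 1])
print(outputs, cache_size)
```

Both models give the same answers on this stream. The difference is in how they get there: the first keeps a history that grows with every token and re-reads it on each step, while the second carries a single value forward.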

Long context isn't real memory. Chain-of-thought isn't real reasoning.

That's why the industry is searching for the next architecture. MoE, state-space models, recurrent structures, latent reasoning, test-time compute — every direction is being explored.

Shazeer joining OpenAI at this moment sends a clear signal: a man who helped define the Transformer era is now setting out to define whatever comes after it.

The Talent War Just Escalated

Zoom out. Shazeer's move is a single node in a much larger conflict.

June 2026. The AI talent market is on fire:

OpenAI poached Shazeer — and Gemini's co-lead — from Google
Anthropic hired Andrej Karpathy (OpenAI co-founder) to lead Claude pretraining
Anthropic brought in ex-Microsoft Azure AI exec Eric Boyd for infrastructure
Barret Zoph rejoined OpenAI — and left again 5 months later
Elon Musk sued OpenAI for stealing trade secrets. The court dismissed it. With prejudice.

That's not all.

On the same day Shazeer announced his move, Anthropic and OpenAI both filed for IPOs, taking themselves public almost simultaneously.

In the pre-IPO window, talent is the single most important currency.

Whoever hires the best people wins the next generation of model competition. Whoever finds the post-Transformer architecture breaks the cost curve and the capability ceiling.

Shazeer isn't the first to be fought over at this scale. He won't be the last.

But he is the most unusual.

Eric Schmidt recalled in a 2015 Stanford talk that Shazeer once asked him for access to thousands of compute chips, saying, "I'm going to solve general knowledge by this weekend."

That attempt failed. But Schmidt said: "If there's anyone in the world who might actually pull this off, he's the one I'd bet on."

Ten years later, Shazeer joined the company closest to that goal.

This time, Google's $2.7 billion wasn't enough.

June 2026. Model competition has become talent competition. Talent competition has become architecture competition. And architecture competition will decide the endgame of this AI cycle.
