Noam Shazeer was one of the lead authors of the seminal paper "Attention Is All You Need", which introduced the transformer architecture. (From Wikipedia)
Some context for people who haven’t followed the full loop: Shazeer was a long-time Google researcher, joined Google in 2000, and was one of the co-authors of “Attention Is All You Need.”
He left Google in 2021 to co-found Character.AI. In 2024, Google brought him and some Character.AI researchers back via a licensing/talent deal with Character.AI (reportedly around $2.7B). He was then made a Gemini co-lead.
Now he’s leaving Google again for OpenAI.
Exciting times!
Wow. What could possibly have caused him to quit so soon after coming back?
I hope this is not accurate but I'm afraid it is: https://x.com/signulll/status/2067446889956430273
https://old.reddit.com/r/singularity/comments/1u8xc9m/most_l...
Seems like there are some insights here!
edit: it seems the post has been removed but comments are viewable.
1 liner summary:
To put it lightly, the dude was politically outspoken and held strong beliefs.
[Edit: note that my comment was reparented, it was originally a response to someone claiming Noam was another "Scam Altman". I don't mind the reparenting or the killing of the original subthread, but I feel like this is necessary context to understand this.]
Noam is the real deal, he was pretty legendary within old-time ('00s) Google engineering. Paul Buchheit had a story about interviewing him with the "how to write a spellchecker" question and then him coming up with something better than the state-of-the-art, then basically delivering Google's spell corrector in his first 2-week Noogler project.
AI hiring starting to look like sports free agency.
Karpathy to Anthropic, now Noam to OpenAI.
Question one: How much did this cost OpenAI?
Question two: Why are OpenAI spending that money taking talent from Google, who can definitely outspend them for talent, and not Anthropic, who are leading the market and are at least somewhat financially constrained.
Very bad news for Gemini - the brief comeback with 2.5 Pro last year looked to be driven by Noam
Wow - Google paid a couple billion dollars to bring Noam back. Really impressive by OAI if this reporting is accurate!
This does suck for Google. Noam will take a lot of Google trade secrets with him to OpenAi. Google's bench is deeper than this one guy though.
Surprised to not see more comments on this, especially given the popularity of the Anthropic/Karpathy article. What a win for OpenAI - and what a loss for Google, just 2 years after paying $2.7bn to bring Noam back into the fold. Does not bode well for Gemini long-term... Or could be a signal for how deeply they are leaning into world models.
I hope this doesn't impact Google's progress on open models.
From the excited comments and fanboyism, I have to say KRAZAM predicted the cult of personality that has infected the AI space.
Oh! Big deal. Exciting.
Looks like Google is leaking both AI talent and know-how something fierce ... and since the very day the transformer paper was written.
As an outsider, I'd be really curious to understand why, given how well positioned they seem to be in the AI battle:
- huge, quasi unmatched data war chest
- huge, quasi unmatched, planet-scale infrastructure
- native AI chip design and production (TPU)
- the core ideas for what we now know as "AI" were invented there
- deepmind, enough said
- pretty much the deepest pocket of all the AI players with the possible exception of MSFT
- a massively large user base and reach to deploy AI to (Android, YT, Cloud, Search, Email, ...)
- supposedly one the best engineering culture of the valley
Why do the best people leave ?
Why do their AI product always come in 3rd place ?
Why can't they seem to take the lead, both in terms of product design or in term of raw LLM performance?
The only answer I can think of is:
- culture is completely broken
- management sucks something fierce
- company is so fat and rich no one is actually interested in winning anymore
Its getting pretty lame that we talk about the these guys like they're football players transferring teams.
This is what you call a PR hire.
Good luck Noam, Gemini is a great piece of work.
I guess this means Google is nowhere close, to even discern a hint of an AGI? So when Demis Hassabis says AGI...could arrive in just 3 years he has learned the best from Larry Ellison?
Niceee
Silver lining: given the leaked financials of OpenAI, he might very well be joining a sinking ship.
Also, why didn't they nail him down contractually when they bought character.ai ... isn't that pretty standard with these type of superstar (re)hires?
Huge blow to Google.
I doubt that the money had anything to do with it.
I also doubt that the state of the technology at OAI vs. Google had much to do with it, Google is behind no doubt, but the gap is not as far as we know, insurmountable.
I suspect that this is a leadership clash. Noam was working in GDM. GDM somehow went away from coding and RSI into "world models" and that has played out very poorly. Who made that call? Who was still playing politics?
Given this is Noam the list of people that could be pissing him off is very small: Demis, Sergey (?!), a couple of VPs in GDM.
What the hell happened?
Tell me open ai are in emergency mode without telling me they are in emergency mode
[dead]
[dead]
[dead]
[dead]
[flagged]
For those interested, Wired ran a backstory about the Attention is All You Need paper 2 years ago: https://www.wired.com/story/eight-google-employees-invented-...
It gives some context on the contributions of each of the authors. About Shazeer, from the article:
Shazeer’s joining the group was critical. “These theoretical or intuitive mechanisms, like self-attention, always require very careful implementation, often by a small number of experienced ‘magicians,’ to even show any signs of life,” says Uszkoreit. Shazeer began to work his sorcery right away. He decided to write his own version of the transformer team’s code. “I took the basic idea and made the thing up myself,” he says. Occasionally he asked Kaiser questions, but mostly, he says, he “just acted on it for a while and came back and said, ‘Look, it works.’” Using what team members would later describe with words like “magic” and “alchemy” and “bells and whistles,” he had taken the system to a new level.