simgt • today at 9:59 AM • 1 reply

> I replicated David Ng's RYS method [...] found something I didn't expect.

> Transformers appear to have discrete "reasoning circuits" — contiguous blocks of 3-4 layers that act as indivisible cognitive units. Duplicate the right block and the model runs its reasoning pipeline twice. No weights change. No training. The model just thinks longer.

How did you not expect that if you read his post? That's literally what he discovered, two years ago.

For anyone interested, there's more meat in the post and comments from last week: https://news.ycombinator.com/item?id=47322887
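
For readers who want the quoted idea in concrete terms, here is a minimal toy sketch of "duplicate a contiguous block of layers without touching weights". It is not the code from either post: `duplicate_block`, `run`, and `make_layer` are hypothetical names, and the "layers" are plain Python callables standing in for transformer blocks. It only illustrates the reordering; which block to duplicate is the empirical question the posts are about.

```python
from typing import Callable, List, Sequence

# A "layer" here is any callable from state to state (a stand-in for a transformer block).
Layer = Callable[[list], list]


def duplicate_block(layers: Sequence[Layer], start: int, end: int) -> List[Layer]:
    """Return a new layer order in which layers[start:end] runs twice in a row.

    The same layer objects are reused, so no parameters are copied or modified.
    """
    if not 0 <= start < end <= len(layers):
        raise ValueError("block must be a non-empty range inside the stack")
    return list(layers[:end]) + list(layers[start:end]) + list(layers[end:])


def run(layers: Sequence[Layer], x: list) -> list:
    """Apply the layers in order, like a forward pass through the stack."""
    for layer in layers:
        x = layer(x)
    return x


def make_layer(index: int) -> Layer:
    """Toy layer that records its own index so the execution order is visible."""
    return lambda trace: trace + [index]


stack = [make_layer(i) for i in range(6)]   # hypothetical 6-layer model
repeated = duplicate_block(stack, 2, 5)     # run layers 2..4 a second time
order = run(repeated, [])
print(order)
```

In a real model the same trick would amount to reusing the existing block modules at a second position in the forward pass, which is why no training or weight change is involved.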

Replies

regularfry • today at 10:50 AM

That's explicitly not the unexpected part. Read the rest of the post.
