This layer duplication strikes me as a bit of "poor man's" version of looped language...

naasking • today at 3:10 PM • 1 reply

This layer duplication strikes me as a bit of a "poor man's" version of looped language models:

https://ouro-llm.github.io/

Pretty cool though. LLM brain surgery.

Replies

Agreed, but one thing to note:

From the experiments, I really think that 'organs' (not sure what to call them) develop during massive pretraining. That would also mean looping the entire model may not actually be efficient. Maybe a better layout is [linear input section -> loop 1 -> linear section -> loop 2 -> linear section -> ... -> loop n -> linear output]?

This would give 'organs' space to develop.
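
To make the proposed layout concrete, here is a rough Python sketch of the layer execution order it implies. Everything in it is an assumption for illustration: the `Segment` type, the `layer_schedule` and `run` helpers, the 8-layer stack, and the choice of which layers loop and how many times. It is not taken from the experiments or from the linked looped-model work; it only shows how "linear" sections (run once) and looped sections (run several times) could be interleaved.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Segment:
    """A contiguous group of layers (hypothetical helper type)."""
    layers: Sequence[int]  # indices into the model's layer stack
    repeats: int = 1       # 1 = plain linear section, >1 = looped section


def layer_schedule(segments: Sequence[Segment]) -> List[int]:
    """Flatten segments into the order in which layers are executed."""
    order: List[int] = []
    for seg in segments:
        if seg.repeats < 1:
            raise ValueError("repeats must be at least 1")
        for _ in range(seg.repeats):
            order.extend(seg.layers)
    return order


def run(x, layers: Sequence[Callable], segments: Sequence[Segment]):
    """Apply the layers to x in schedule order."""
    for i in layer_schedule(segments):
        x = layers[i](x)
    return x


# Assumed example: an 8-layer stack with two looped sections.
segments = [
    Segment(layers=[0, 1]),             # linear input section
    Segment(layers=[2, 3], repeats=2),  # loop 1
    Segment(layers=[4]),                # linear section
    Segment(layers=[5, 6], repeats=3),  # loop 2
    Segment(layers=[7]),                # linear output section
]
schedule = layer_schedule(segments)
# schedule == [0, 1, 2, 3, 2, 3, 4, 5, 6, 5, 6, 5, 6, 7]

# For comparison, looping the entire model twice would be:
whole_model_loop = layer_schedule([Segment(layers=list(range(8)), repeats=2)])
```

In this sketch the input, middle, and output sections each run once, so only the looped blocks pay for extra passes; whether that actually helps 'organs' develop is exactly the open question.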
