Hacker News

throw310822 today at 4:35 PM

I keep thinking of the RYS (Repeat Yourself) experiment of simply looping some of the inner layers of LLMs for better results and wonder if any progress was made on it.

https://dnhkng.github.io/posts/rys/

It feels like it should be straightforward to integrate a small network into LLMs to control the looping. Or to just duplicate entire blocks of layers after the initial training.
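To make the two options concrete, here is a minimal toy sketch in plain Python. It is not the code from the RYS post: the layers are made-up scalar functions standing in for transformer blocks, and `make_layer`, `run_looped`, `duplicate_block`, `run_with_controller`, and the `keep_looping` callback are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_layer(scale: float, shift: float) -> Layer:
    """Toy stand-in for a transformer block: a residual-style update h + f(h)."""
    def layer(h: Vector) -> Vector:
        return [x + scale * x + shift for x in h]
    return layer


def run_looped(layers: Sequence[Layer], start: int, end: int,
               repeats: int, h: Vector) -> Vector:
    """Run layers[:start], then layers[start:end] `repeats` times, then layers[end:]."""
    for layer in layers[:start]:
        h = layer(h)
    for _ in range(repeats):
        for layer in layers[start:end]:
            h = layer(h)
    for layer in layers[end:]:
        h = layer(h)
    return h


def duplicate_block(layers: Sequence[Layer], start: int, end: int,
                    copies: int) -> List[Layer]:
    """Build a deeper stack in which layers[start:end] appears `copies` times in a row.

    The copies share the same callables here (tied weights); in a real model
    they could be untied and fine-tuned separately.
    """
    return list(layers[:start]) + list(layers[start:end]) * copies + list(layers[end:])


def run_with_controller(layers: Sequence[Layer], start: int, end: int, h: Vector,
                        keep_looping: Callable[[Vector, int], bool],
                        max_repeats: int = 8) -> Tuple[Vector, int]:
    """Loop the inner block at least once, until the controller says stop.

    `keep_looping` is a placeholder for a learned halting network; `max_repeats`
    is an assumed safety cap so the loop always terminates.
    """
    for layer in layers[:start]:
        h = layer(h)
    steps = 0
    while True:
        for layer in layers[start:end]:
            h = layer(h)
        steps += 1
        if steps >= max_repeats or not keep_looping(h, steps):
            break
    for layer in layers[end:]:
        h = layer(h)
    return h, steps
```

With tied weights, looping a block `n` times and duplicating it `n` times compute the same thing. The difference is that the duplicated stack can later be trained so the copies diverge, while the controller version can pick the loop count per input.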


Replies

naasking today at 6:58 PM

Yes, computing in latent space is a big thing now.

https://ouro-llm.github.io/