Hacker News

vibe42 · today at 4:22 PM

Perhaps not widely known, but certainly known in LLM research. A number of these experiments were done two years ago, and what's interesting is that the approach still seems to work on the latest models.

Beware, though, that the increased scores on math and EQ could come at the cost of other areas scoring less well; I'd love to see how these models score across all the open benchmarks.


Replies

v9v · today at 4:57 PM

The author claimed that the models he modified with this layer-repetition method topped the Hugging Face Open LLM Leaderboard in his first post: https://dnhkng.github.io/posts/rys/
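For anyone unfamiliar with the technique: "layer repetition" (often called a passthrough or self-merge) duplicates a contiguous slice of a transformer's decoder layers so the block runs more than once in sequence. The post doesn't include code, so this is just a minimal sketch of the idea on a toy stack; the function name and indices are illustrative, and I'm assuming a Llama-style `nn.ModuleList` of decoder layers:

```python
import copy
import torch.nn as nn

def repeat_layer_block(layers: nn.ModuleList, start: int, end: int) -> nn.ModuleList:
    """Return a new stack with layers[start:end] duplicated immediately
    after the original slice. Weights are deep-copied, so the repeated
    block starts identical but has independent parameters."""
    new_layers = list(layers[:end])
    new_layers += [copy.deepcopy(layer) for layer in layers[start:end]]
    new_layers += list(layers[end:])
    return nn.ModuleList(new_layers)

# Toy stand-in for a 6-layer decoder stack; real use would operate on
# something like model.model.layers in a Llama-family checkpoint.
stack = nn.ModuleList([nn.Linear(4, 4) for _ in range(6)])
expanded = repeat_layer_block(stack, start=2, end=4)
print(len(expanded))  # 8: layers 2-3 now appear twice in sequence
```

In practice people do this with merge tooling rather than by hand, and the repeated copies are usually left weight-tied or lightly fine-tuned afterwards.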

Do you remember the names of the previous experiments done on this? Would love to take a look.
