Hacker News

UltraSane 12/07/2025

Yes. The path dependence favoring current attention-based LLMs is enormous.


Replies

patapong 12/07/2025

At the same time, there is now a ton of data for training models to act as useful assistants, and benchmarks for comparing different assistant models. The wide availability and ease of obtaining new RLHF training data will, I think, make it more feasible to build models on new architectures.
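
To illustrate how accessible this kind of preference data has become, here is a minimal sketch that loads one publicly available RLHF dataset (Anthropic's HH-RLHF on Hugging Face); the specific dataset is just an example, not something the commenter names:

```python
from datasets import load_dataset

# Load a public RLHF preference dataset (Anthropic's HH-RLHF).
# Each record pairs a preferred ("chosen") and a rejected response
# to the same prompt, which is the raw material for reward modeling
# or direct preference optimization on any architecture.
ds = load_dataset("Anthropic/hh-rlhf", split="train")

print(len(ds))                    # number of preference pairs
print(ds[0]["chosen"][:200])      # preview of the preferred response
print(ds[0]["rejected"][:200])    # preview of the rejected response
```

Since datasets like this are architecture-agnostic (plain text pairs), the same preference data could in principle be reused to post-train a non-attention model.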