Hacker News

loveparade · today at 6:25 AM · 1 reply

I doubt there is anything special about the transformer code the frontier labs use. The only proprietary parts are probably the infrastructure-specific optimizations for very large-scale distributed training and some GPU kernel tricks. The real moat is the training data, especially the RLHF/finetuning data and verifiable reward environments, and of course the GPU clusters.
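For context on why the model code itself is no moat: the core operation of every transformer, scaled dot-product attention, is published and fits in a few lines. Here is a minimal NumPy sketch (names and shapes are illustrative, not any lab's actual implementation):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, as in "Attention Is All You Need"."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # similarity logits, (seq_q, seq_k)
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                            # weighted mix of value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The proprietary work lives around this kernel (fused GPU implementations, sharding, pipeline parallelism), not in the math itself.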

The open-source models are quite close, and they'd probably be just as good given the compute and data the frontier labs have access to.


Replies

dgb23 · today at 7:36 AM

That’s what I’m thinking as well.

However, I assume usage data could become increasingly valuable as well. That will likely help the big commercial cloud models maintain a head start for general use.