logoalt Hacker News

xpcttoday at 1:38 AM1 replyview on HN

That would depend on what gets leaked, as I'm not so sure that the weights by themselves would be enough to replicate the architecture. I imagine some part of the secret sauce will remain in the architecture, and the tensor dimensions may not be enough to decode it.

I'm sure if proprietary models continue to be a big thing, the methodology of their storage and loading on hardware will be obfuscated quite a bit.


Replies

anonzzziestoday at 3:31 AM

But you can see this is not true (yet); competitors/Chinese labs are less than 6 months behind: either via leaks or by just stumbling on the same improvements with time/effort.

show 1 reply