Hacker News

ttul, yesterday at 2:40 PM

Yes. How do they do it? They must literally have PagerDuty set up to alert the team the second one of the labs releases anything.


Replies

beernet, yesterday at 2:43 PM

They obviously collaborate with some of the labs prior to the official release date.

sigbottle, yesterday at 2:46 PM

Is quantization a mostly solved pipeline at this point? I thought architectures were varied and weird enough that you can't just click a button, say "go optimize these weights", and be done. New models ship with new code, right? So you'd have to analyze that code, automatically insert the quantization in the right places, and then make sure that doesn't degrade perf.

Maybe I just don't understand how quantization works, but I thought quantization was a very nasty problem involving a lot of plumbing.
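
For reference, the arithmetic at the bottom of that pipeline is small. Here is a minimal sketch, assuming plain symmetric per-tensor round-to-nearest int8, which is only one of many schemes and not necessarily what any lab or quantization team actually uses. The function names are made up for illustration, and the weights are random stand-ins, not from a real model.

```python
import random


def quantize_int8(weights):
    """Symmetric per-tensor round-to-nearest quantization (illustrative only)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float weights."""
    return [v * scale for v in q]


def max_abs_error(a, b):
    """Largest element-wise difference between two equal-length lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical stand-in for one weight tensor.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
err = max_abs_error(weights, restored)

# Round-to-nearest keeps every weight within half a quantization step.
print(f"scale={scale:.6g} max_abs_error={err:.6g} bound={scale / 2:.6g}")
```

This only bounds the per-weight rounding error. It says nothing about where in a new architecture's code the quantized tensors have to be wired in, or whether the model's outputs still hold up afterwards, which is the plumbing the question is about.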
