9cb14c1ec0 • yesterday at 5:46 PM • 1 reply

I don't think so. There are other knobs they can tweak to reduce load that affect quality less than quantizing, like trimming the conversation length without telling you, reducing reasoning effort, etc.

Replies

mgraczyk • yesterday at 9:48 PM

We never do anything that reduces model intelligence like that.
