Is it faster for large models, or are the optimizations more noticeable with small models? Seeing that the benchmark uses a 0.6B model made me wonder about that.
I have not tested it, but it's from a DeepSeek employee. I don't know if it's used in prod there or not!