
windexh8er • yesterday at 10:11 PM

I think you missed the point and either don't understand or aren't considering the utility of SLMs.

Replies

But I’m not missing the point. If you can run one frontier model at 750 t/s, then you can probably run many instances of an SLM in parallel at an aggregate rate that exceeds 15k t/s. That’s kind of the point of the flash or ultrafast variants. And they’re based on something much more modern than Llama 3.1.
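To make the arithmetic behind that claim concrete, here is a rough sketch of how many parallel instances it takes to beat a target aggregate rate. The helper name `instances_needed` is hypothetical, and the per-instance rate of 750 t/s is only an assumed lower bound (an SLM running no slower than the frontier model); real aggregate throughput also depends on batching and memory bandwidth, which this ignores.

```python
import math


def instances_needed(target_tps: float, per_instance_tps: float) -> int:
    """Smallest number of parallel instances whose combined rate
    strictly exceeds target_tps, assuming throughput scales linearly."""
    if per_instance_tps <= 0:
        raise ValueError("per_instance_tps must be positive")
    return math.floor(target_tps / per_instance_tps) + 1


if __name__ == "__main__":
    # 15k t/s target from the thread; 750 t/s per instance is an assumption.
    n = instances_needed(15_000, 750)
    print(f"{n} instances -> {n * 750} t/s aggregate")
```

If each SLM instance runs faster than 750 t/s, which is plausible for a much smaller model, the required instance count drops accordingly.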
