Why is this, I wonder? Aren't the models trained on roughly the same blob of huggingface web scrapes anyway? Does one tool do a better job of pre-parsing the web data, or pre-parsing the prompts, or enhancing the prompts? Or a better sequence of self-repair in an agent-like conversation? Or maybe higher-precision weights and a more expensive model?
Probably there isn't enough compute to serve everyone from a frontier LLM.
> Why is this, I wonder?
because that's Microsoft's business model
their products are just good enough to let them put a checkbox in a feature table, so the product can be sold to someone who will then never have to use it
but not a penny more will be spent than the absolute bare minimum to get that checkbox
this explains Teams, Azure, and every other product of theirs you can think of