What are they going to train with 13.5M really? We're a tiny company in Amsterdam in Holland an...

jurschreuder • today at 10:07 PM • 2 replies

What are they really going to train with 13.5M? We're a tiny company in Amsterdam, in Holland, and we've got "only 64x B300 to train on", so I thought we could never make an LLM, since we've got only 4M in compute.

And they're going to train an LLM with all kinds of extra difficulties compared to OpenAI for just 13.5M?

The very first Llama cost 16M for a single training run.

Replies

numeri • today at 10:35 PM

Prices for training have dropped immensely in terms of research required, code efficiency, algorithmic/sample efficiency, and possibly also hardware (I'm not qualified to say without looking into FLOPS/dollar, or even to be certain that's the right metric here).

LaurensBER • today at 10:12 PM

This is too little, too late. Europe really needs to start focusing.

All these tiny niche models are perhaps fun as an academic exercise, or great for the researchers' résumés, but I highly doubt that they'll add any value or be used for anything serious.

Even if this becomes a somewhat decent model with a fantastic understanding of "gezellig", "kring verjaardag" or "pannenkoeken", how many people will interact with it before its limits drive them back to a frontier model?

Even if the purpose of this is government & other regulated industries, do we really want our government to use a poor model? Either do it right or don't do it at all.
