A related argument I raised a few days back on HN:
What's the moat with these giant data centers that are being built with hundreds of billions of dollars spent on Nvidia chips?
If such chips can be built so easily, and offer this insane level of performance at 10x efficiency, then one thing is 100% sure: more such startups are coming... and with that, an entire new ecosystem.
You'd still need those giant data centers for training new frontier models. These Taalas chips, if they work, seem to do the job of inference well, but training will still require general-purpose GPU compute.
I think their hope is that they’ll have the “brand name” and expertise to have a good head start when real inference hardware comes out. It does seem very strange, though, to have all this massive infrastructure investment in what is ultimately going to be useless prototyping hardware.
Nvidia bought all the capacity so their competitors can't be manufactured at scale.
If I am not mistaken, this chip was built specifically for the Llama 8B model. Nvidia chips are general-purpose.
RAM hoarding is, AFAICT, the moat.