comandillos • yesterday at 10:13 PM • 0 replies • view on HN

This is still far from viable for actually useful models, like bigger MoE ones with much larger context windows. The technology is very promising, just like Cerebras, but we need to see whether they can keep this up as models evolve over the next few years. Extremely interesting nevertheless.
