> But on a tangent, why do you believe in mixture of experts
With a hardware inference approach you can generate tens of thousands of tokens per second and run your agents in a breadth-first style. It is all conceptually very simple, and not more than a few years away.
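To make the breadth-first idea concrete, here is a minimal sketch: the `expand` function is a hypothetical stand-in for a batched, high-throughput inference call that turns one agent task into follow-up tasks, and the driver processes whole frontiers level by level rather than drilling into one branch at a time.

```python
from collections import deque

def expand(task: str) -> list[str]:
    # Placeholder for a batched model call; here each task simply spawns two
    # subtasks until a depth limit (dots in the task name) is reached.
    depth = task.count(".")
    if depth >= 2:
        return []
    return [f"{task}.{i}" for i in range(2)]

def run_breadth_first(root: str) -> list[str]:
    # Process an entire frontier at once: with fast enough inference, every
    # task in a level could be expanded in a single batch instead of serially.
    order, frontier = [], deque([root])
    while frontier:
        level = list(frontier)
        frontier.clear()
        order.extend(level)
        for task in level:  # conceptually one batched call per level
            frontier.extend(expand(task))
    return order

print(run_breadth_first("t"))
# → ['t', 't.0', 't.1', 't.0.0', 't.0.1', 't.1.0', 't.1.1']
```

The point of the breadth-first shape is that each level is an independent batch, which is exactly what a very high tokens-per-second backend can exploit.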