palmotea • last Thursday at 2:40 PM • 1 reply

> you need enormous VRAM laden farms of GPUs to do inference on a model like Opus 4.6.

It's probably a trade secret, but what's the actual per-user resource requirement to run the model?

alt Hacker News