Hacker News

sbszllr · yesterday at 8:32 PM

I agree that it's performant enough for many applications; I work in the field. But it isn't performant enough to run large-scale LLM inference with reasonable latency, especially when you compare the throughput of single-tenant inference inside a TEE against batched non-private inference.
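To make the gap concrete, here's a rough back-of-envelope sketch (all numbers are illustrative placeholders, not measurements): decode is roughly memory-bandwidth-bound, so a batch shares each sweep over the weights, while a single-tenant TEE request pays for the whole sweep alone and additionally eats the CC-mode encryption overhead on I/O.

    # Illustrative back-of-envelope only -- model size, bandwidth, and the
    # assumed ~10% TEE I/O overhead are all placeholder assumptions.
    weights_gb = 140          # e.g. a ~70B-parameter model in fp16
    hbm_bw_gbs = 3350         # H100 SXM HBM3 bandwidth, GB/s
    tee_overhead = 0.90       # assumed throughput factor for CC-mode encryption

    def decode_tok_per_sec(batch_size: int) -> float:
        # each decode step streams the full weights once, shared by the batch
        steps_per_sec = hbm_bw_gbs / weights_gb
        return steps_per_sec * batch_size

    single_tenant_tee = decode_tok_per_sec(1) * tee_overhead
    batched_plain = decode_tok_per_sec(64)
    print(f"single-tenant TEE: ~{single_tenant_tee:.0f} tok/s")
    print(f"batch-64 non-private: ~{batched_plain:.0f} tok/s")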


Replies

ramoz · yesterday at 8:39 PM

We just served DeepSeek-R1 on this bad boy in CC + TEE, with an integrated signing layer we developed for vLLM.
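For anyone curious what a signing layer can look like mechanically, here's a minimal sketch (hypothetical, not our exact code): it assumes vLLM's OpenAI-compatible HTTP server on its default port and a TEE-resident Ed25519 key whose public half would be bound to the enclave's attestation report.

    # Hypothetical sketch only -- endpoint, model id, and key handling are
    # assumptions; real key provisioning/attestation binding is omitted.
    import hashlib, json
    import requests
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    signing_key = Ed25519PrivateKey.generate()  # stand-in for the TEE-resident key

    def signed_completion(prompt: str) -> dict:
        resp = requests.post(
            "http://localhost:8000/v1/completions",  # vLLM OpenAI-compatible API
            json={"model": "deepseek-ai/DeepSeek-R1", "prompt": prompt},
            timeout=600,
        ).json()
        # sign a digest of the canonicalized response so clients can verify
        # the output came out of this enclave unmodified
        digest = hashlib.sha256(json.dumps(resp, sort_keys=True).encode()).digest()
        return {"response": resp, "signature": signing_key.sign(digest).hex()}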

https://pasteboard.co/k1hjwT7pWI6x.png

Reach out if you're interested in a collab.