Hacker News

CamperBob2 · today at 4:31 PM · 2 replies

I don't get how this can work, and Moxie (or rather his LLM) never bothers to explain. How can an LLM possibly exchange encrypted text with the user without decrypting it?

The correct solution isn't yet another cloud service, but rather local models.


Replies

FrasiertheLion · today at 5:00 PM

The model is running in a secure enclave that spans the GPU using NVIDIA Confidential Computing: https://www.nvidia.com/en-us/data-center/solutions/confident.... The connection is encrypted with a key that is only accessible inside the enclave.

Within the enclave itself, DRAM and the PCIe link between the CPU and GPU are encrypted, but the CPU registers and the GPU's onboard memory are plaintext. So the computation does happen on plaintext data; it's just extremely difficult to access that data, even from the machine running the enclave.
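To make the flow concrete, here is a toy, self-contained sketch of what "a key that is only accessible inside the enclave" means: the enclave generates a keypair whose private half never leaves it, a vendor-signed attestation report binds the public key to a measurement of the enclave code, and the client verifies that report before encrypting its prompt. Everything here (EnclaveSim, the HMAC "signature", the tiny Diffie-Hellman group, the XOR stream cipher) is illustrative only; a real deployment uses NVIDIA's attestation service, real signatures, and TLS.

```python
import hashlib
import hmac
import secrets

P = 2**61 - 1   # toy Diffie-Hellman modulus -- far too small for real use
G = 3
VENDOR_KEY = b"vendor-root-key"  # stands in for the hardware vendor's signing key
                                 # (real attestation uses public-key signatures)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


class EnclaveSim:
    """Simulates a model server running inside a confidential enclave."""

    def __init__(self):
        self._priv = secrets.randbelow(P - 3) + 2   # never leaves the enclave
        self.pub = pow(G, self._priv, P)
        measurement = hashlib.sha256(b"inference-server-v1").digest()
        # Attestation report: vendor-signed binding of code hash + public key.
        body = measurement + self.pub.to_bytes(16, "big")
        self.report = (body, hmac.new(VENDOR_KEY, body, "sha256").digest())

    def handle(self, client_pub: int, ciphertext: bytes) -> bytes:
        key = pow(client_pub, self._priv, P).to_bytes(16, "big")
        prompt = keystream_xor(key, ciphertext)   # plaintext exists only here
        answer = b"echo: " + prompt               # stand-in for LLM inference
        return keystream_xor(key, answer)


def client_query(enclave: EnclaveSim, prompt: bytes) -> bytes:
    body, sig = enclave.report
    # Verify the attestation before sending anything sensitive.
    expected = hmac.new(VENDOR_KEY, body, "sha256").digest()
    if not hmac.compare_digest(sig, expected):
        raise RuntimeError("attestation failed: not a genuine enclave")
    priv = secrets.randbelow(P - 3) + 2
    key = pow(enclave.pub, priv, P).to_bytes(16, "big")
    reply = enclave.handle(pow(G, priv, P), keystream_xor(key, prompt))
    return keystream_xor(key, reply)


print(client_query(EnclaveSim(), b"what is 2+2?"))  # b'echo: what is 2+2?'
```

The point of the sketch is the trust model, not the crypto: the host OS can relay the ciphertext but never holds the session key, because the enclave's private key is generated inside the attested boundary.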

boramalper · today at 4:40 PM

They explain it in their "Private inference" blog post [0] if you want to read more.

[0] https://confer.to/blog/2026/01/private-inference/