Hacker News

CamperBob2 · today at 4:31 PM · 2 replies

I don't get how this can work, and Moxie (or rather his LLM) never bothers to explain. How can an LLM possibly exchange encrypted text with the user without decrypting it?

The correct solution isn't yet another cloud service, but rather local models.


Replies

FrasiertheLion · today at 5:00 PM

The model is running in a secure enclave that spans the GPU using NVIDIA Confidential Computing: https://www.nvidia.com/en-us/data-center/solutions/confident.... The connection is encrypted with a key that is only accessible inside the enclave.

Within the enclave itself, DRAM and the PCIe link between the CPU and GPU are encrypted, but the CPU registers and the GPU's onboard memory are plaintext. So the computation does happen on plaintext data; it's just extremely difficult to access that data, even from the machine running the enclave.
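To make the flow concrete, here is a toy, self-contained sketch of what "a key that is only accessible inside the enclave" means: the enclave generates a keypair whose private half never leaves it, a vendor-signed attestation report binds the public key to a measurement of the enclave code, and the client verifies that report before encrypting its prompt. Everything here (EnclaveSim, the HMAC "signature", the tiny Diffie-Hellman group, the XOR stream cipher) is illustrative only; a real deployment uses NVIDIA's attestation service, real signatures, and TLS.

```python
import hashlib
import hmac
import secrets

P = 2**61 - 1   # toy Diffie-Hellman modulus -- far too small for real use
G = 3
VENDOR_KEY = b"vendor-root-key"  # stands in for the hardware vendor's signing key
                                 # (real attestation uses public-key signatures)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


class EnclaveSim:
    """Simulates a model server running inside a confidential enclave."""

    def __init__(self):
        self._priv = secrets.randbelow(P - 3) + 2   # never leaves the enclave
        self.pub = pow(G, self._priv, P)
        measurement = hashlib.sha256(b"inference-server-v1").digest()
        # Attestation report: vendor-signed binding of code hash + public key.
        body = measurement + self.pub.to_bytes(16, "big")
        self.report = (body, hmac.new(VENDOR_KEY, body, "sha256").digest())

    def handle(self, client_pub: int, ciphertext: bytes) -> bytes:
        key = pow(client_pub, self._priv, P).to_bytes(16, "big")
        prompt = keystream_xor(key, ciphertext)   # plaintext exists only here
        answer = b"echo: " + prompt               # stand-in for LLM inference
        return keystream_xor(key, answer)


def client_query(enclave: EnclaveSim, prompt: bytes) -> bytes:
    body, sig = enclave.report
    # Verify the attestation before sending anything sensitive.
    expected = hmac.new(VENDOR_KEY, body, "sha256").digest()
    if not hmac.compare_digest(sig, expected):
        raise RuntimeError("attestation failed: not a genuine enclave")
    priv = secrets.randbelow(P - 3) + 2
    key = pow(enclave.pub, priv, P).to_bytes(16, "big")
    reply = enclave.handle(pow(G, priv, P), keystream_xor(key, prompt))
    return keystream_xor(key, reply)


print(client_query(EnclaveSim(), b"what is 2+2?"))  # b'echo: what is 2+2?'
```

The point of the sketch is the trust model, not the crypto: the host OS can relay the ciphertext but never holds the session key, because the enclave's private key is generated inside the attested boundary.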

boramalper · today at 4:40 PM

They explain it in their "Private inference" blog post [0] if you want to read more.

[0] https://confer.to/blog/2026/01/private-inference/