> SSD streaming to GPU
Is this solution based on what Apple describes in their 2023 paper 'LLM in a flash' [1]?
A similar approach was recently featured here: https://news.ycombinator.com/item?id=47476422 The iPhone Pro, though, has very limited RAM (12 GB total), which you still need for the active part of the model. (Unless you want to use Intel Optane wear-resistant storage, but that was power-hungry and thus unsuitable for a mobile device.)
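The core idea, keeping weights on flash and paging in only the parameters a sparse layer actually touches, can be sketched with a memory map. This is a minimal illustration in that spirit, not the paper's implementation; the file name, shapes, and "active neurons" list are all hypothetical:

```python
import numpy as np

# Hypothetical FFN weight matrix stored on disk rather than in RAM.
rows, cols = 4096, 4096
weights = np.lib.format.open_memmap(
    "ffn_weights.npy", mode="w+", dtype=np.float32, shape=(rows, cols)
)
weights[:] = 0.0  # placeholder init; real weights would come from a checkpoint
weights.flush()

# At inference time: re-open read-only; the OS pages in data on demand.
w = np.load("ffn_weights.npy", mmap_mode="r")
active = [3, 17, 1024]           # e.g. neurons predicted to fire this token
chunk = np.asarray(w[active])    # only these rows are read from flash
print(chunk.shape)               # (3, 4096)
```

The paper's contribution is largely about making such reads efficient (row/column bundling, windowing the activation cache) so the flash bandwidth, not capacity, becomes the constraint.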
This is not entirely dissimilar to what Cerebras does with their weight streaming.
Yes. I collected some details here: https://simonwillison.net/2026/Mar/18/llm-in-a-flash/