Hacker News

zozbot234 today at 3:33 PM

A similar approach was recently featured here (https://news.ycombinator.com/item?id=47476422), though iPhone Pro has very limited RAM (12GB total) which you still need for the active part of the model. (Unless you want to use Intel Optane wearout-resistant storage, but that was power hungry and thus unsuitable for a mobile device.)


Replies

Aurornis today at 4:06 PM

> Though iPhone Pro has very limited RAM (12GB total) which you still need for the active part of the model.

This is why mixture of experts (MoE) models are favored for these demos: Only a portion of the weights are active for each token.
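
To make that concrete, here is a rough sketch of top-k expert routing, assuming a generic MoE layer rather than any specific model. The expert count, `TOP_K`, and the parameter sizes are made-up numbers for illustration, and `top_k_route` / `active_fraction` are hypothetical helper names.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Softmax over only the selected experts' logits (shifted for stability).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_fraction(num_experts, k, expert_params, shared_params):
    """Fraction of all parameters touched while processing a single token."""
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


# Hypothetical sizes, chosen only to illustrate the ratio.
NUM_EXPERTS = 8
TOP_K = 2

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in router output
routes = top_k_route(logits, TOP_K)
frac = active_fraction(NUM_EXPERTS, TOP_K,
                       expert_params=1_000_000, shared_params=500_000)

print("experts used for this token:", [i for i, _ in routes])
print(f"share of weights active: {frac:.1%}")
```

Only the shared weights plus the selected experts are needed for a given token. The selection changes from token to token, though, so the inactive experts still have to be fetched quickly from wherever they are stored.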

simonw today at 3:42 PM

Yeah, this new post is a continuation of that work.