Hacker News

nolist_policy · today at 6:48 PM

These are based on the Gemma 3n architecture, so E2B only needs about 2GB for text-to-text generation:

https://ai.google.dev/gemma/docs/gemma-3n#parameters

You can think of the per-layer embeddings as a vector database, so in theory you can serve them directly from disk.
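A minimal sketch of that idea: keep the per-layer-embedding table on disk as a memory-mapped array and only page in the rows for the current token ids. The shapes and filename here are hypothetical stand-ins, not Gemma 3n's real tensor layout.

```python
import numpy as np

# Hypothetical sizes -- Gemma 3n's actual per-layer-embedding shapes differ.
VOCAB_SIZE, NUM_LAYERS, PLE_DIM = 262_144, 30, 256

# One-time setup: write the PLE table to disk (zeros as a stand-in for weights).
table = np.lib.format.open_memmap(
    "ple.npy", mode="w+", dtype=np.float16,
    shape=(VOCAB_SIZE, NUM_LAYERS, PLE_DIM))
table[:] = 0
table.flush()

# At inference time: memory-map the file and fetch only the rows for the
# current token ids. The OS pages in just those slices, so the full table
# never has to sit in RAM -- the "vector database served from disk" idea.
ple = np.load("ple.npy", mmap_mode="r")
token_ids = np.array([17, 4242, 99])
per_layer = ple[token_ids]  # shape: (3, NUM_LAYERS, PLE_DIM)
```

The lookup is just an indexed read, so cold-cache latency is one disk fetch per token rather than loading the whole embedding table up front.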