How do these compare to Apple's Foundation Models, btw?
So much better. Hard to quantify, but even the small Gemma 4 models have that feels-like-ChatGPT magic that Apple's models lack.
For one concrete difference, AFM has a 4096-token context window, while Gemma can be configured with a 32k+ token context window.
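As a minimal sketch of what that configuration looks like, assuming you're running a local Gemma GGUF through llama-cpp-python (the model filename here is a placeholder, not a real download):

```python
# Sketch: loading a local Gemma model with a 32k context window
# via llama-cpp-python. n_ctx sets the context window in tokens.
from llama_cpp import Llama

llm = Llama(
    model_path="./gemma-4b-it-Q4_K_M.gguf",  # hypothetical local path
    n_ctx=32768,  # 32k context; AFM is capped at 4096 tokens
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this long document."}],
)
print(out["choices"][0]["message"]["content"])
```

The practical upshot is that long documents or multi-turn conversations that would blow past AFM's 4096-token cap fit comfortably in a single context here.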