The model I'm running here is Llama 3.2 1B, the smallest on-device model I've tried that has given me good results.
The fact that a 1.2GB download can do as well as this is honestly astonishing to me - but it's going to look laughably poor in comparison to something like GPT-4o, which I'm guessing is measured in the 100s of GBs.
You can try out Llama 3.2 1B yourself directly in your browser (it will fetch about 1GB of data) at https://chat.webllm.ai/
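If you'd rather wire it into your own page than use the hosted chat, the WebLLM npm package (@mlc-ai/web-llm) exposes an OpenAI-style API. Rough sketch from memory - the exact model ID is an assumption on my part, so check WebLLM's prebuilt model list:

```js
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Downloads and compiles the model into the browser's WebGPU runtime.
// Model ID is a guess; look up the exact string in WebLLM's prebuilt model list.
const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
  initProgressCallback: (p) => console.log(p.text), // ~1GB fetch, cached afterwards
});

// OpenAI-style chat completion, running entirely on-device.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
});
console.log(reply.choices[0].message.content);
```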
> that has given me good results.
Can you help somebody who's out of the loop frame/judge/measure 'good results'?
Can you give an example of something it can do that's impressive/worthwhile? Can you give an example of where it falls short / gets tripped up?
Is it just a hallucination machine? What good does that do for anybody? Genuinely trying to understand.
anyone else think 4o is kinda garbage compared to the older gpt4, as well as o1-preview and probably o1-mini?
gpt4 tends to be more accurate than 4o for me.