Not sure how to formulate this, but what does this mean in terms of how "smart" it is compared to the latest ChatGPT version?
The implementation has no control over "how smart" the model is, and when it comes to Llama 1B, it's not very smart by current standards (though it would still have blown everyone's mind just a few years back).
The model I'm running here is Llama 3.2 1B, the smallest on-device model I've tried that has given me good results.
The fact that a 1.2GB download can do as well as this is honestly astonishing to me - but it's going to be laughably poor in comparison to something like GPT-4o, which I'm guessing is measured in the 100s of GBs.
You can try out Llama 3.2 1B yourself directly in your browser (it will fetch about 1GB of data) at https://chat.webllm.ai/
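If you'd rather drive it from code than use the chat page, here's a minimal sketch using the WebLLM library. The model id string and exact options are assumptions on my part - check WebLLM's prebuilt model list for the current identifier:

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Model id is an assumption - look up the exact Llama 3.2 1B
// identifier in WebLLM's prebuilt model list.
const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
  // Logs download progress while the ~1GB of weights are fetched
  // (they're cached by the browser after the first load).
  initProgressCallback: (p) => console.log(p.text),
});

// OpenAI-style chat completion, running entirely in the browser.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain what an LLM is in one sentence." }],
});
console.log(reply.choices[0].message.content);
```

The nice part is that after the initial download everything runs locally on your GPU via WebGPU - no API key, no server round-trips.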