What type of hardware do I need to run a small model like this? I don't do Apple.
1.5B models can run CPU-only inference at around 12 tokens per second, if I remember correctly.
1.54GB model? You can run this on a Raspberry Pi.
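For anyone who wants to sanity-check numbers like these, here is a rough back-of-the-envelope sketch. It assumes the weights dominate memory use (ignoring the KV cache and runtime overhead) and that generation is limited by memory bandwidth, with every weight read once per token. The function names and the 15 GB/s bandwidth figure are made up for illustration, not measurements of any particular machine.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Ignores file metadata, KV cache, and runtime overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


def max_tokens_per_second(model_gb: float, mem_bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed, assuming each generated token
    requires reading all weights from memory once."""
    return mem_bandwidth_gb_s / model_gb


if __name__ == "__main__":
    n_params = 1.5e9          # a 1.5B-parameter model
    assumed_bandwidth = 15.0  # GB/s, hypothetical value; check your own hardware

    for bits in (16, 8, 4):
        size = model_size_gb(n_params, bits)
        speed = max_tokens_per_second(size, assumed_bandwidth)
        print(f"{bits}-bit: ~{size:.2f} GB, at most ~{speed:.0f} tokens/s")
```

Real throughput will usually come in below this bound, but it gives a feel for why a model this size fits on small boards and ordinary desktop CPUs.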