Thanks for posting the performance numbers from your own validation. 6-7 tokens/sec is quite remarkable for the hardware.
After some more benchmarking with larger outputs (like writing an entire, relatively complex TODO list app), it seems to drop to 4-6 tokens/s. Still impressive.