Same here, I use Qwen 3.6 27b (Q6 quant) with llama.cpp on an RTX 5090 using the pi agent exclusivel...

heipei • today at 6:40 PM • 0 replies • view on HN

Same here, I use Qwen 3.6 27b (Q6 quant) with llama.cpp on an RTX 5090 using the pi agent exclusively now. The fact that it's local means that I never have to think about token pricing, quotas, time of day, or data sensitivity. I have limited the GPU from 600W to 450W which means the system stays whisper quiet during inference.

I have become so "lazy" (in a good way), so far that I've started using the model for lots of daily mundane things on top of just coding:

  * "commit this on a branch, push, create a PR and assign $nickname for review"
  * "Use the Stripe CLI to download all open and overdue invoices and reconcile them with this CSV export from our bank account."
  * "Use these Elasticsearch credentials to summarise what kind of operations are causing load at the moment."
  * "Tell me if our codebase already supports X and where it's  implemented."

alt Hacker News