logoalt Hacker News

rapatel0today at 1:48 AM0 repliesview on HN

I got qwen3.6:27B running on my 4090 (24GB) with ~128K context leveraging some of the recent turboquant/rotorquant memory optimizations for activations. Highly suggest going up to that. the q4_xl+rotorquant combo is pretty good.

Some reference code if you want to throw your agent at it. https://github.com/rapatel0/rq-models