qwen3.6 does a good job locally, except it can take 20-30 minutes to respond to a prompt on a Mac Studio with 32 GB of RAM.
Yeah, you probably do want to use a GPU for models of that size.
I also wonder what quantization you're using. If you haven't tried other quants, I really would.
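Rough back-of-the-envelope arithmetic shows why the quant matters so much on a 32 GB machine. This is a sketch, not exact: the parameter count (32B) is a hypothetical example, and the bits-per-weight figures for Q8_0 and Q4_K_M are approximate averages for llama.cpp-style GGUF quants; real memory use also includes the KV cache and runtime overhead.

```python
def model_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for a model with params_b billion
    parameters stored at the given average bits per weight."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# Hypothetical 32B-parameter model at a few common precisions.
# Bits-per-weight values are approximations, not exact GGUF figures.
for name, bits in [("FP16", 16.0), ("Q8_0", 8.5), ("Q4_K_M", 4.85)]:
    print(f"{name}: ~{model_size_gb(32, bits):.1f} GB")  # FP16: ~64.0 GB, etc.
```

At FP16 a 32B model's weights alone exceed 32 GB of RAM, forcing swapping; a ~4-5 bit quant brings them under 20 GB and leaves headroom for the KV cache, which is often the difference between minutes and seconds per response.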
Apple Silicon before the M5 does not have dedicated matmul hardware in the GPU, which makes prompt processing very slow. It's quite different on the M5, much like using an Nvidia GPU.