Thanks! I'm seeing a 10/90 CPU/GPU split with gemma4:26b, so I guess there's at least something to gain by adding the other GPU. From what I gather, there may also be something to gain by connecting the monitor to the iGPU instead, to free up VRAM.
Just in case anyone is interested in how a consumer PC setup like this performs (still using only 1x RTX 5080, 64GB system RAM and an Intel Ultra 270K-Plus): I've now tested Qwen3.6:35b-a3b with ollama at default settings, and I'm getting around 86 t/s. The lowest I've seen so far is 70 t/s. The CPU/GPU split with 35b is 39/61% (with a 4K 165 Hz monitor connected to the 5080, so there's probably some room for optimization by moving it to the iGPU).
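If anyone wants to compare numbers on their own box, here's a rough sketch of how a t/s figure can be computed. It assumes the `eval_count` and `eval_duration` fields (duration in nanoseconds) that ollama's generate API reports in its response, as far as I know. The helper name `tokens_per_second` and the sample values are made up for illustration, they are not my measurements.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from a token count and a duration in nanoseconds.

    Assumption: the inputs correspond to ollama's `eval_count` and
    `eval_duration` response fields, with the duration in nanoseconds.
    """
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    # Convert nanoseconds to seconds, then divide tokens by seconds.
    return eval_count / (eval_duration_ns / 1e9)


# Illustrative values only (hypothetical, not measured on my machine).
sample = {"eval_count": 430, "eval_duration": 5_000_000_000}
rate = tokens_per_second(sample["eval_count"], sample["eval_duration"])
print(f"{rate:.1f} t/s")
```

As far as I know, `ollama run <model> --verbose` prints an eval rate as well, so this is mostly useful if you're scripting runs and want to log the min/average yourself.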
The best part is that this setup is basically dead silent (it could, hypothetically speaking, be running in my bedroom just fine, and I'm a light sleeper).