I assume your concern with GPU passthrough is that each VM needs a whole GPU? You can use GPU-PV (GPU paravirtualization, as found in Hyper-V) to split one GPU between VM instances. The main bottleneck then becomes how thinly you slice up the VRAM.
More info here:
https://web.archive.org/web/20231107182321/https://mu0.cc/20...
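
To get a rough feel for the VRAM side of that tradeoff, here is a back-of-the-envelope sketch. It is plain arithmetic, not a GPU-PV API call: the function name `vram_per_vm`, the 24 GiB card, the 2 GiB host reserve, and the even split are all assumptions made up for illustration.

```python
def vram_per_vm(total_vram_mib: int, vm_count: int, host_reserve_mib: int = 0) -> int:
    """Return an even VRAM share in MiB per VM (hypothetical helper, not a real API).

    Assumes the host keeps `host_reserve_mib` for itself and the rest is
    divided equally, which a real setup does not have to do.
    """
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    usable = total_vram_mib - host_reserve_mib
    if usable < 0:
        raise ValueError("host reserve exceeds total VRAM")
    # Integer division: any remainder is simply left unassigned.
    return usable // vm_count


if __name__ == "__main__":
    total = 24 * 1024  # assumed 24 GiB card, in MiB
    for n in (2, 4, 8):
        share = vram_per_vm(total, n, host_reserve_mib=2 * 1024)
        print(f"{n} VMs: {share} MiB each")
```

With those made-up numbers, going from 2 to 8 VMs drops each guest from about 11 GiB to under 3 GiB, which is where the split starts to hurt for anything texture- or model-heavy.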