As of llama.cpp build b8966, the SYCL backend is still not great.
| model | size | params | backend | ngl | test | t/s |
| --------------------- | --------: | ------: | ------- | --: | -----: | -------------: |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | SYCL | 999 | pp2048 | 851.81 ± 6.50 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | SYCL | 999 | tg128 | 42.05 ± 1.99 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | Vulkan | 999 | pp2048 | 2022.28 ± 4.82 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | Vulkan | 999 | tg128 | 114.15 ± 0.23 |
| qwen35 27B Q6_K | 23.87 GiB | 26.90 B | SYCL | 999 | pp2048 | 299.93 ± 0.40 |
| qwen35 27B Q6_K | 23.87 GiB | 26.90 B | SYCL | 999 | tg128 | 14.58 ± 0.06 |
| qwen35 27B Q6_K | 23.87 GiB | 26.90 B | Vulkan | 999 | pp2048 | 581.99 ± 0.86 |
| qwen35 27B Q6_K | 23.87 GiB | 26.90 B | Vulkan | 999 | tg128 | 10.64 ± 0.12 |
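For reference, tables in this format are produced by `llama-bench`; a sketch of the invocation that would match the `pp2048`/`tg128` rows above (model filenames are placeholders, not the actual paths used):

```shell
# Hypothetical model paths; point these at your local GGUF files.
# -ngl 999 offloads all layers to the GPU; -p 2048 produces the pp2048
# (prompt processing) row and -n 128 the tg128 (token generation) row.
./llama-bench -m gpt-oss-20b-mxfp4.gguf -ngl 999 -p 2048 -n 128
./llama-bench -m qwen-27b-q6_k.gguf    -ngl 999 -p 2048 -n 128
```

The backend column reflects which build of llama.cpp was used (one compiled with SYCL support, one with Vulkan), not a runtime flag.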
Edit: I've no idea why one would use gpt-oss-20b at Q8, but the result is basically the same:

| model | size | params | backend | ngl | test | t/s |
| --------------------- | --------: | ------: | ------- | --: | -----: | -------------: |
| gpt-oss 20B Q8_0 | 11.27 GiB | 20.91 B | SYCL | 999 | pp2048 | 854.16 ± 6.06 |
| gpt-oss 20B Q8_0 | 11.27 GiB | 20.91 B | SYCL | 999 | tg128 | 44.02 ± 0.05 |
| gpt-oss 20B Q8_0 | 11.27 GiB | 20.91 B | Vulkan | 999 | pp2048 | 2022.24 ± 6.97 |
| gpt-oss 20B Q8_0 | 11.27 GiB | 20.91 B | Vulkan | 999 | tg128 | 114.02 ± 0.13 |
Hopefully, support for the B70 will continue to improve. In retrospect, I probably should have bought an R9700 instead...