> I'd rather place that 10K on a RTX Pro 6000 if I was choosing between them.
One RTX Pro 6000 is not going to be able to run GLM-4.7, so it's not really a choice if that is the goal.
You definitely could: the RTX Pro 6000 has 96 (!!!) gigs of memory. You could load 2 experts at once at an MXFP4 quant, or one expert at FP8.
No, but the models you will be able to run will run fast, and many of them are Good Enough(tm) for quite a lot of tasks already. I mostly use GPT-OSS-120B and glm-4.5-air currently; both easily fit and run incredibly fast, and the runners haven't even been fully optimized for Blackwell yet, so time will tell how fast they can go.
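For a rough sense of why those two fit: the weights-only footprint is just parameter count times bits per weight. A minimal back-of-envelope sketch, assuming ~117B total parameters for GPT-OSS-120B and ~106B for GLM-4.5-Air (those counts are my assumptions, not from the thread), and ignoring KV cache and activation memory:

```python
def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB at a given quantization width."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# Assumed total parameter counts; weights only, no KV cache or runtime overhead.
for name, params in [("GPT-OSS-120B", 117), ("GLM-4.5-Air", 106)]:
    for label, bits in [("MXFP4-ish (~4.25 bits)", 4.25), ("FP8 (8 bits)", 8)]:
        print(f"{name} @ {label}: ~{weight_footprint_gb(params, bits):.0f} GB")
```

Under those assumptions, both models land around 55-65 GB at ~4-bit quantization, comfortably inside 96 GB, while a full FP8 copy of either would already be at or above the card's capacity before accounting for KV cache.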