I know it's just a quick test, but llama 3.1 is getting a bit old. I would have liked to see a newer model that can fit, such as gpt-oss-120, (gpt-oss-120b-mxfp4.gguf), which is about 60gb of weights (1).
Correct, most of r/LocalLlama moved onto next gen MoE models mostly. Deepseek introduced few good optimizations that every new model seems to use now too. Llama 4 was generally seen as a fiasco and Meta haven't made a release since
That and more in https://github.com/geerlingguy/ai-benchmarks/issues/34