Unfortunately, the bigger models are pretty slow in tokens per second. Token generation is mostly limited by memory bandwidth, and the memory is just not that fast.
You can check how each model performs on AMD Strix Halo here:
https://kyuz0.github.io/amd-strix-halo-toolboxes/
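
For a rough feel of why memory speed is the limit, here is a back-of-the-envelope sketch. It assumes that generating each token requires reading all active weights from memory once, so bandwidth divided by model size gives an upper bound on tokens per second. The 256 GB/s figure is my assumption for the theoretical peak bandwidth of Strix Halo, and the function name and the 4-bit (0.5 bytes per parameter) quantization are just illustrative choices. Real numbers will be lower, so trust the benchmarks in the link over this estimate.

```python
def max_tokens_per_second(bandwidth_gb_s: float,
                          active_params_billion: float,
                          bytes_per_param: float) -> float:
    """Upper bound on decode speed for a memory-bandwidth-bound model.

    Assumes every active weight is read from memory once per token and
    ignores KV cache traffic, compute time, and any other overhead.
    """
    # Billions of parameters times bytes per parameter gives size in GB.
    model_size_gb = active_params_billion * bytes_per_param
    return bandwidth_gb_s / model_size_gb


# Assumed values: 256 GB/s peak bandwidth, 70B dense model, 4-bit weights.
estimate = max_tokens_per_second(256, 70, 0.5)
print(f"Upper bound: {estimate:.1f} tokens/s")
```

With those assumed numbers, a dense 70B model tops out at roughly 7 tokens/s, while a mixture-of-experts model with fewer active parameters per token would do much better on the same memory.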