The author is confused about how performance tuning works. Step one, get it right. Step two, see if it's fast enough for the problem at hand.
There is almost never a step three.
But if there is, it's this. Step three: measure.
Now enter a loop of "try something, measure, go to step 2".
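For what "measure" can look like in practice, here is a minimal sketch using the standard library's testing.Benchmark, which also works outside of "go test". The function joinNaive and the helper measure are hypothetical names made up for illustration, not anything from the article.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the compiler from discarding the result of the measured call.
var sink string

// joinNaive is a hypothetical function under suspicion: it builds a string
// by repeated concatenation.
func joinNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// measure runs fn under the standard benchmark harness and returns the
// timing and allocation statistics the harness collected.
func measure(fn func()) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			fn()
		}
	})
}

func main() {
	parts := strings.Split(strings.Repeat("x,", 100), ",")
	r := measure(func() { sink = joinNaive(parts) })
	fmt.Printf("%d ns/op, %d allocs/op\n", r.NsPerOp(), r.AllocsPerOp())
}
```

Run it, change one thing, run it again. If the numbers are good enough for your problem, you are back at step two and finished.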
Of the things you can try, optimizing GC overhead is but one of many options. Arenas are but one of many options for how to do that.
And the thing about performance optimizations is that they can be intensely local. If you can remove 100% of the allocations on just the happy path inside of one hot loop in your code, then when you loop back to step two, you might find you are done. That does not require an arena allocator with global applicability.
Go gives realistic programmers the right tools to succeed.
And Go's limitations give people like the author plenty of ammunition to fight straw men. Tant pis.
Step 3 is always useful (if not necessary) once you reach a certain scale.