True. But in my experience, the pattern of just using short-lived goroutines, via errgroup or a channel-based semaphore, will typically get you full utilization across all cores, assuming your limit is high enough.
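For anyone who hasn't seen the channel-based semaphore version, here is a minimal sketch using only the standard library. The function name `squareAll` and the squaring "work" are hypothetical stand-ins for whatever the real task is, and `limit` is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll is a hypothetical example: it starts one short-lived goroutine
// per input, with at most limit of them running at any moment.
// limit must be >= 1, otherwise the first send on sem blocks forever.
func squareAll(inputs []int, limit int) []int {
	results := make([]int, len(inputs))
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup

	for i, v := range inputs {
		sem <- struct{}{} // blocks while limit goroutines are already in flight
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when this goroutine exits
			results[i] = v * v       // each goroutine writes only its own index
		}(i, v)
	}

	wg.Wait() // all goroutines have finished before results is read
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}, 2)) // prints [1 4 9 16]
}
```

The buffered channel's capacity is the concurrency limit, so raising `limit` is the only knob to turn.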
Full utilization is perhaps less guaranteed in patterns that feed a fixed, limited number of long-running goroutines.
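For contrast, here is a rough sketch of the fixed-pool shape I mean, again standard library only. The name `sumSquares` and the work itself are hypothetical, and `workers` is assumed to be at least 1. Whatever the scheduler does, no more than `workers` items are ever processed at once here.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical example: it feeds inputs over a channel to a
// fixed number of long-running worker goroutines and sums their results.
// workers must be >= 1, otherwise the send on jobs blocks forever.
func sumSquares(inputs []int, workers int) int {
	jobs := make(chan int)
	partial := make([]int, workers) // one accumulator per worker, so no locking is needed
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for v := range jobs { // each worker lives until jobs is closed
				partial[w] += v * v
			}
		}(w)
	}

	for _, v := range inputs {
		jobs <- v // blocks until some worker is free to receive
	}
	close(jobs)
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 2)) // prints 30
}
```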