Given the context, my guess is bad cache keys causing spurious cache misses, where the keys are built in some low-level way. Cache misses almost certainly have a bigger asymptotic impact than extra copies, unless the copy constructor is really heavy.
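To make that guess concrete, here is a rough Python sketch of how a low-level key can cause spurious misses. Everything in it (`raw_key`, `canonical_key`, `count_misses`, the sample requests) is made up for illustration and is not taken from the code under discussion. It only shows one way this can happen: the key depends on an incidental detail (dict insertion order) rather than on the logical value.

```python
# Hypothetical illustration; none of these names come from the original code.

def raw_key(params: dict) -> str:
    # Low-level key: repr() reflects insertion order, so two dicts that
    # compare equal can still produce different keys.
    return repr(params)


def canonical_key(params: dict) -> tuple:
    # Canonical key: sort the items so logically equal inputs share a key.
    return tuple(sorted(params.items()))


def count_misses(requests, make_key) -> int:
    # Simulate a cache and count how often a lookup fails.
    cache = {}
    misses = 0
    for params in requests:
        key = make_key(params)
        if key not in cache:
            misses += 1
            cache[key] = True  # stand-in for an expensively computed value
    return misses


requests = [
    {"width": 10, "height": 20},
    {"height": 20, "width": 10},  # same request, different insertion order
    {"width": 10, "height": 20},
]

raw_misses = count_misses(requests, raw_key)
canonical_misses = count_misses(requests, canonical_key)
print(raw_misses, canonical_misses)
```

With the raw key the second request misses even though it is logically identical to the first, and every such miss pays for a full recomputation.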
I'm reminded of a performance issue I heard about eons ago, where the comparison callback passed to a sort function inadvertently allocated memory on every call. It made sorting very slow. Someone said in a meeting that sorting was slow, and we all had a laugh about "shouldn't have used the bubble sort!" But the real problem was the key comparison doing something stupid.
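For anyone who hasn't run into this, here is a small Python analogue of that kind of bug. It is only a sketch, not the code from that incident: `normalize`, `slow_compare`, and the word list are invented for illustration. The comparator rebuilds (and allocates) a normalized string for both arguments on every comparison, whereas computing the key once per element does the same work only n times.

```python
from functools import cmp_to_key

# Hypothetical illustration; a counter stands in for "allocations".
calls = {"normalize": 0}


def normalize(s: str) -> str:
    # Builds new string objects on every call.
    calls["normalize"] += 1
    return s.strip().lower()


def slow_compare(a: str, b: str) -> int:
    # Anti-pattern: redo the normalization inside every comparison.
    ka, kb = normalize(a), normalize(b)
    return (ka > kb) - (ka < kb)


def sort_slow(items):
    # normalize() runs twice per comparison.
    return sorted(items, key=cmp_to_key(slow_compare))


def sort_fast(items):
    # normalize() runs once per element; the sort compares the cached keys.
    return sorted(items, key=normalize)


words = ["  Pear", "apple ", "Fig", "banana", " Cherry ", "date", "Elder", "grape"]

calls["normalize"] = 0
slow_result = sort_slow(words)
slow_calls = calls["normalize"]

calls["normalize"] = 0
fast_result = sort_fast(words)
fast_calls = calls["normalize"]

print(slow_calls, fast_calls)
```

Both versions return the same order, but the comparator version calls `normalize` twice per comparison, so the gap widens as the input grows. The sort algorithm itself is identical in both cases, which is why blaming "the bubble sort" would have missed the point.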