Yeah ... and so it happens that this particular function in the profile is often just a symptom: a single observation of how the system behaves under a given workload, not the root cause. Say a load instruction burns 90% of the CPU cycles waiting on data from memory. The profile points you at that load, and so gives you the wrong clue about the code that is actually creating the memory bus contention.
I have to say that until I had a pretty good understanding of CPU internals, the memory subsystem, the kernel, and the hardware in general, reading perf profiles was just a fun exercise that gave me almost no meaningful results.
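To make the "symptom, not root cause" point concrete, here is a rough back-of-the-envelope sketch. It is a toy model, not a measurement: the function name `profile_share` and every number in it (load count, miss rate, latencies, cycles spent elsewhere) are made-up assumptions chosen only to show the arithmetic. The function under the profiler runs the same code in both cases; only the memory latency it sees changes, because something else is hammering the memory bus.

```python
# Toy model of how a sampling profiler attributes cycles to one load
# instruction. All figures are illustrative assumptions, not real data.

def profile_share(loads, miss_rate, miss_latency, other_cycles, hit_latency=4):
    """Return the fraction of all cycles charged to the load instruction."""
    misses = loads * miss_rate
    hits = loads - misses
    # Stall cycles are charged to the load that waits, not to whoever
    # caused the wait.
    load_cycles = hits * hit_latency + misses * miss_latency
    return load_cycles / (load_cycles + other_cycles)

# Same function, same loads, same miss rate in both runs.
quiet = profile_share(
    loads=1_000_000, miss_rate=0.05, miss_latency=100, other_cycles=5_000_000
)

# Another thread now saturates the memory bus (assumed): each miss simply
# takes longer. The code being profiled did not change at all.
contended = profile_share(
    loads=1_000_000, miss_rate=0.05, miss_latency=1000, other_cycles=5_000_000
)

print(f"share of cycles on the load, quiet system:     {quiet:.0%}")
print(f"share of cycles on the load, contended system: {contended:.0%}")
```

In this model the load's share of the profile jumps just because misses got slower, which is exactly the situation where staring at the hot function tells you nothing about the code causing the contention.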