Hacker News

A 40-line fix eliminated a 400x performance gap

97 points by bluestreak yesterday at 11:00 PM | 20 comments

Comments

ot today at 12:53 AM

You can go even faster, to about 8 ns (almost another 10x improvement), by using software perf events: PERF_COUNT_SW_TASK_CLOCK measures thread CPU time and can be read through a shared page (so no syscall; see perf_event_mmap_page). You then add the delta since the last context switch with a single rdtsc call inside a seqlock.

This is not well documented, unfortunately, and I'm not aware of any open-source implementations of it.

EDIT: Or maybe not. I'm not sure whether PERF_COUNT_SW_TASK_CLOCK allows selecting only user time. The kernel can definitely do it, but I don't know if the wiring is there. However, this definitely works for overall thread CPU time.
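
For readers who want to poke at the counter itself, here is a minimal C sketch that opens PERF_COUNT_SW_TASK_CLOCK for the calling thread and reads it the slow way, with read(2). It deliberately does not implement the shared-page, rdtsc, and seqlock fast path described above. The helper names `open_task_clock` and `read_task_clock_ns` are made up for illustration, and depending on the kernel's perf_event_paranoid setting or a sandbox's syscall filter, the open may simply fail.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a task-clock software counter for the
 * calling thread. Returns a file descriptor, or -1 if the kernel
 * refuses (permissions, seccomp filter, perf support missing). */
static int open_task_clock(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_SOFTWARE;
    attr.config = PERF_COUNT_SW_TASK_CLOCK;
    /* pid = 0 and cpu = -1: measure this thread on whatever CPU it runs.
     * glibc has no wrapper for perf_event_open, hence syscall(2). */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: read the counter (nanoseconds) with read(2).
 * This is the syscall path, not the mmap fast path. Returns -1 on error. */
static int64_t read_task_clock_ns(int fd)
{
    uint64_t ns;
    if (read(fd, &ns, sizeof ns) != (ssize_t)sizeof ns)
        return -1;
    return (int64_t)ns;
}
```

Call `open_task_clock()` once per thread, keep the descriptor, and call `read_task_clock_ns(fd)` whenever a reading is needed; close the descriptor when the thread is done.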

goodroot today at 1:33 AM

QuestDB and the team are among the best doing it.

Love the people and their software.

Great blog Jaromir!

jerrinot yesterday at 11:13 PM

Author here. After my last post about kernel bugs, I spent some time looking at how the JVM reports its own thread activity. It turns out that "What is the CPU time of this thread?" is/was a much more expensive question than it should be.

ee99ee today at 12:07 AM

This is such a great writeup

higherhalf today at 12:41 AM

clock_gettime() goes through the vDSO, avoiding a context switch. It shows up on the flamegraph as well.
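
For reference, asking for the current thread's CPU time through clock_gettime() looks roughly like this in C. This is a minimal sketch: `thread_cpu_time_ns` is a hypothetical helper name, and whether a particular clock ID is served from the vDSO or falls back to a real syscall depends on the kernel and on the clock being requested.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: CPU time consumed so far by the calling thread,
 * in nanoseconds, or -1 if the clock is unavailable. */
static int64_t thread_cpu_time_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

Taking the difference between two calls around a piece of work gives the CPU time that work consumed on this thread, excluding time spent sleeping or descheduled.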
