Hacker News

dagmx, 07/31/2025

Other than just the niceness of the interface, a key one is that the M4 generation added profiling of CPU branching, and afaik Instruments is the only thing that supports it right now.


Replies

verte_zerg, 07/31/2025

In the M4, Apple mostly added counters for the SME engine. The full list of supported counters can be found in the official guide: https://developer.apple.com/documentation/apple-silicon/cpu-...

Regarding branch profiling, all arm64 (M1+) CPUs support these counters:

- BRANCH_CALL_INDIR_MISPRED_NONSPEC
- BRANCH_COND_MISPRED_NONSPEC
- BRANCH_INDIR_MISPRED_NONSPEC
- BRANCH_MISPRED_NONSPEC
- BRANCH_RET_INDIR_MISPRED_NONSPEC
- INST_BRANCH
- INST_BRANCH_CALL
- INST_BRANCH_COND
- INST_BRANCH_INDIR
- INST_BRANCH_RET
- INST_BRANCH_TAKEN

Afaik nothing prevents fetching all of these counters, based on ibireme’s research on kperf. Btw, a fork of "poop" can already fetch BRANCH_MISPRED_NONSPEC.
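
To make it concrete what you get once the counters are read: a rough sketch of turning raw counts into misprediction rates. The counter names are the ones from the list in this comment; the numbers are made up for illustration, the `mispredict_rate` helper is hypothetical, and I'm assuming INST_BRANCH and INST_BRANCH_COND count non-speculative (retired) branches so they are a sensible denominator. How the counts get read (kperf, Instruments, a tool like "poop") is out of scope here.

```python
# Hypothetical counter readings. These values are invented for illustration
# and were not measured on any real chip.
sample = {
    "INST_BRANCH": 1_000_000,
    "INST_BRANCH_COND": 800_000,
    "BRANCH_MISPRED_NONSPEC": 25_000,
    "BRANCH_COND_MISPRED_NONSPEC": 20_000,
}


def mispredict_rate(counters, mispred_key, total_key):
    """Return mispredictions per counted branch, or None if no branches were counted."""
    total = counters.get(total_key, 0)
    if total == 0:
        return None  # avoid dividing by zero when the counter is missing or empty
    return counters.get(mispred_key, 0) / total


# Overall rate: all mispredicted branches over all branch instructions.
overall = mispredict_rate(sample, "BRANCH_MISPRED_NONSPEC", "INST_BRANCH")
# Conditional-only rate: mispredicted conditional branches over conditional branches.
cond = mispredict_rate(sample, "BRANCH_COND_MISPRED_NONSPEC", "INST_BRANCH_COND")

print(f"overall: {overall:.2%}, conditional: {cond:.2%}")
```

The same ratio works for the indirect, call, and return variants by pairing each *_MISPRED_NONSPEC counter with its matching INST_BRANCH_* counter.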