I've played with bpf iterators and written a post about them [1]. The benefit of iterating over tasks instead of scanning procfs is a pretty astounding performance difference:
> I ran benchmarks on current code in the datadog-agent which reads the relevant data from procfs as described at the beginning of this post. I then implemented benchmarks for capturing the same data with bpf. The performance results were a major improvement.
> On a Linux system with around 250 procs it took the procfs implementation 5.45 ms vs 75.6 us for bpf (bpf is ~72x faster). On a Linux system with around 10,000 procs it took the procfs implementation ~296 ms vs ~3 ms for bpf (bpf is ~100x faster).
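For anyone curious, the bpf side of this is small. Here's a minimal sketch of an iter/task program in the style of the kernel's bpf_iter selftests (illustrative only, not the datadog-agent code):

    // Minimal BPF task iterator: emits one line per task, read straight
    // from the kernel's task_struct instead of /proc.
    // BPF_SEQ_PRINTF comes from libbpf (bpf_helpers.h in recent versions,
    // bpf_tracing.h in older ones).
    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("iter/task")
    int dump_task(struct bpf_iter__task *ctx)
    {
        struct seq_file *seq = ctx->meta->seq;
        struct task_struct *task = ctx->task;

        /* The iterator is invoked one final time with task == NULL. */
        if (!task)
            return 0;

        BPF_SEQ_PRINTF(seq, "%d\t%s\n", task->pid, task->comm);
        return 0;
    }

Once loaded, it can be pinned with bpftool iter pin and read like a normal file (a single read drains the whole task list), or consumed from userspace via the fd returned by bpf_iter_create().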
Thanks for including so many examples! Perhaps include an example of the output, too. Other than the mention of the optional '--tree' parameter, it's unclear whether the default result would be a list, a table, JSON, etc.
This is neat, but the examples comparing the tool against piping through grep seem to undercut the argument for me. A couple of pipes to grep is much easier to remember and type, especially with all the quoting psc needs. For scripts where you need exact output, though, this looks great.
I'm not convinced of the need to embed CEL. You could just output JSON and pipe it to jq.
How about comparing it to something sensible like osquery instead of setting up silly strawman ps pipelines?
An unfortunate name that triggers everybody who’s ever worked at Meta :)
> psc uses eBPF iterators to read process and file descriptor information directly from kernel data structures. This bypasses the /proc filesystem entirely, providing visibility that cannot be subverted by userland rootkits or LD_PRELOAD tricks.
Is there a trade-off here?
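For reference, the fd side of that mechanism is typically an iter/task_file program, shaped like the kernel's bpf_iter selftests. A rough sketch (not psc's actual code):

    // Walks every open fd of every task directly from kernel structures,
    // i.e. without going through /proc/<pid>/fd.
    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("iter/task_file")
    int dump_task_file(struct bpf_iter__task_file *ctx)
    {
        struct seq_file *seq = ctx->meta->seq;
        struct task_struct *task = ctx->task;
        struct file *file = ctx->file;

        if (!task || !file)
            return 0;

        /* One line per (process, fd) pair. */
        BPF_SEQ_PRINTF(seq, "%d\t%u\n", task->tgid, ctx->fd);
        return 0;
    }

One obvious trade-off: loading iterators like this needs CAP_BPF/CAP_SYS_ADMIN and a reasonably recent kernel (bpf_iter landed around 5.8), whereas /proc works everywhere.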
Their first example is bad:
ps aux | grep nginx | grep root | grep -v grep
can be done instead (from memory, not at a Linux machine ATM) with:
ps -u root -C nginx
which is arguably better than their solution:
psc 'process.name == "nginx" && process.user == "root"'