From the article:
> I was later surprised all the real world find implementations I examined use tree-walk interpreters instead.
I’m not sure why this would be surprising. The find utility is totally dominated by disk IOPS. The interpretation performance of find conditions is totally swamped by reading stuff from disk. So, keep it simple and just use a tree-walk interpreter.
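To make the trade-off concrete, here is a minimal sketch (my own illustration, not find's actual code) of what a tree-walk interpreter for find-style conditions looks like: the parsed expression is a tree of predicate nodes, and evaluation is just a recursive walk over that tree for each directory entry.

```python
import fnmatch

# Each node evaluates itself directly against a file entry (a dict here,
# standing in for a stat result). Predicate names mirror find's -name/-size.
class Name:
    def __init__(self, pattern): self.pattern = pattern
    def eval(self, entry): return fnmatch.fnmatch(entry["name"], self.pattern)

class SizeAtLeast:
    def __init__(self, min_bytes): self.min_bytes = min_bytes
    def eval(self, entry): return entry["size"] >= self.min_bytes

class And:
    def __init__(self, left, right): self.left, self.right = left, right
    # Python's `and` gives short-circuit evaluation for free, like find's -a
    def eval(self, entry): return self.left.eval(entry) and self.right.eval(entry)

class Or:
    def __init__(self, left, right): self.left, self.right = left, right
    def eval(self, entry): return self.left.eval(entry) or self.right.eval(entry)

# Roughly: find . -name '*.log' -size +1M
expr = And(Name("*.log"), SizeAtLeast(1024 * 1024))
print(expr.eval({"name": "build.log", "size": 5_000_000}))  # True
```

The per-entry cost is a handful of virtual calls, which is exactly the overhead that stat/readdir latency tends to hide.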
Is it truly simpler to do that? A separate “command line to byte codes” module would be way easier to test than one that also does the work, including making any necessary syscalls.
Also, decreasing CPU usage may not speed up find (much), but it would leave more time for running other processes.
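To illustrate that testability point, here is a hedged sketch of the split being proposed: a pure "AST to byte codes" compiler plus a tiny stack VM. The opcode names and AST shape are invented for illustration; the point is that the compiler is a pure function, so it can be unit-tested against expected opcode lists without touching the filesystem or making syscalls.

```python
import fnmatch

def compile_expr(node):
    """Post-order flatten a predicate AST into (opcode, arg...) tuples."""
    kind = node[0]
    if kind in ("name", "size_ge"):          # leaf predicates
        return [node]
    # ("and" | "or", left, right): emit operands, then the operator
    return compile_expr(node[1]) + compile_expr(node[2]) + [(kind,)]

def run(code, entry):
    """Evaluate compiled byte codes against one file entry via a stack VM."""
    stack = []
    for op in code:
        if op[0] == "name":
            stack.append(fnmatch.fnmatch(entry["name"], op[1]))
        elif op[0] == "size_ge":
            stack.append(entry["size"] >= op[1])
        elif op[0] == "and":
            b, a = stack.pop(), stack.pop()
            stack.append(a and b)
        elif op[0] == "or":
            b, a = stack.pop(), stack.pop()
            stack.append(a or b)
    return stack.pop()

# The compiler alone is trivially testable: no I/O, just data in, data out.
ast = ("and", ("name", "*.log"), ("size_ge", 1024))
assert compile_expr(ast) == [("name", "*.log"), ("size_ge", 1024), ("and",)]
assert run(compile_expr(ast), {"name": "a.log", "size": 2048}) is True
```

One caveat: this naive flat encoding loses short-circuiting (both operands are always evaluated), which a real bytecode compiler would restore with jump opcodes, adding back some of the complexity the tree-walk version avoids.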
Yeah that's basically what was discussed here: https://lobste.rs/s/xz6fwz/unix_find_expressions_compiled_by...
And then I pointed to this article on databases: https://notes.eatonphil.com/2023-09-21-how-do-databases-exec...
Even MySQL, Duck DB, and Cockroach DB apparently use tree-walking to evaluate expressions, not bytecode!
Probably for the same reason - many parts are dominated by I/O, so the work on optimization goes elsewhere
And MySQL is a super-mature codebase
The assumption that "find" performance is dominated by disk IOPS is not generally valid.
For instance, I normally compile big software projects in RAM disks (Linux tmpfs), because I typically use computers with no less than 64 GB of DRAM.
Such big software projects may contain very large numbers of files and subdirectories, and their build scripts may use "find".
In such a case there are no SSD or HDD I/O operations, everything is done in the main memory, so the intrinsic performance of "find" may matter.