Hacker News

Unix "find" expressions compiled to bytecode

112 points by rcarmo yesterday at 12:35 PM | 17 comments

Comments

drob518 yesterday at 6:56 PM

From the article:

> I was later surprised all the real world find implementations I examined use tree-walk interpreters instead.

I’m not sure why this would be surprising. The find utility is totally dominated by disk IOPS. The interpretation performance of find conditions is totally swamped by reading stuff from disk. So, keep it simple and just use a tree-walk interpreter.
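
For anyone who hasn't seen one, here is a minimal sketch of a tree-walk interpreter for a find-like expression. It is not taken from any real find implementation: the node classes (`Name`, `Type`, `And`, `Or`, `Not`) and the `evaluate` and `find` functions are hypothetical names chosen for illustration, and only two tests are supported. Unlike real find, this sketch does not report the starting directory itself.

```python
import fnmatch
import os
import stat
from dataclasses import dataclass


# Hypothetical AST node types for a tiny subset of find expressions.
@dataclass
class Name:
    pattern: str  # shell-style glob matched against the basename


@dataclass
class Type:
    kind: str  # "f" for regular file, "d" for directory


@dataclass
class And:
    left: object
    right: object


@dataclass
class Or:
    left: object
    right: object


@dataclass
class Not:
    operand: object


def evaluate(node, path, st):
    """Recursively evaluate the expression tree for one path and its stat result."""
    if isinstance(node, Name):
        return fnmatch.fnmatchcase(os.path.basename(path), node.pattern)
    if isinstance(node, Type):
        if node.kind == "f":
            return stat.S_ISREG(st.st_mode)
        if node.kind == "d":
            return stat.S_ISDIR(st.st_mode)
        raise ValueError(f"unsupported type: {node.kind}")
    if isinstance(node, And):
        # Python's "and"/"or" short-circuit, as find's -a and -o do.
        return evaluate(node.left, path, st) and evaluate(node.right, path, st)
    if isinstance(node, Or):
        return evaluate(node.left, path, st) or evaluate(node.right, path, st)
    if isinstance(node, Not):
        return not evaluate(node.operand, path, st)
    raise TypeError(f"unknown node: {node!r}")


def find(root, expr):
    """Walk the tree under root and yield every entry for which expr is true."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # one system call per entry
            if evaluate(expr, path, st):
                yield path
```

Something like `list(find(".", And(Type("f"), Name("*.c"))))` would correspond roughly to `find . -type f -name '*.c'`. Each entry costs one `lstat` call plus a few Python-level branches, which is the comparison being made above.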

tasty_freeze yesterday at 3:39 PM

That is a fun exercise, but I imagine the time to evaluate the conditional expression is a tiny fraction, a percent or less, of the time it takes to make the file system calls.

burnt-resistor today at 4:02 AM

I recently wrote a "du" summarizer with additional stats in C because it's faster than du, find, or any sort of scripting-language tree walker. Scripted walkers are orders of magnitude slower, but ultimately the work is bounded by iterating kernel VFS structures and by any hard IOPS spent fetching metadata from slower media.

For archiving, I also wrote a parallel walker and file hasher that makes only one pass over the data and stores the results in a SQLite database. It's basically a poor man's IDS and bitrot detector.
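
A rough sketch of that idea in Python, for illustration only. This is not the tool described above: the function names (`hash_file`, `iter_files`, `index_tree`), the table layout, the choice of SHA-256, and the use of a thread pool are all assumptions.

```python
import hashlib
import os
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def hash_file(path, chunk_size=1 << 20):
    """Read the file once in chunks and return (path, size, mtime_ns, sha256)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    st = os.stat(path)
    return path, st.st_size, st.st_mtime_ns, digest.hexdigest()


def iter_files(root):
    """Yield regular files under root, skipping symlinks."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def index_tree(root, db_path, workers=4):
    """Hash every file under root in parallel and record the results in SQLite."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, sha256 TEXT)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Worker threads do the reading and hashing; all database writes
        # happen here on the calling thread as results come back.
        rows = pool.map(hash_file, iter_files(root))
        conn.executemany(
            "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)", rows
        )
    conn.commit()
    conn.close()
```

A bitrot check could then be a later run that flags files whose hash changed while size and mtime stayed the same. Note that this sketch stops at the first unreadable file, since the exception from `hash_file` propagates out of `pool.map`.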