Unix "find" expressions compiled to bytecode

112 points • by rcarmo • yesterday at 12:35 PM • 17 comments

Comments

From the article:

> I was later surprised all the real world find implementations I examined use tree-walk interpreters instead.

I'm not sure why this would be surprising. The find utility is dominated by disk IOPS: the cost of interpreting the find conditions is swamped by the cost of reading metadata from disk. So keep it simple and just use a tree-walk interpreter.
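
For anyone who hasn't written one, a tree-walk interpreter for this kind of expression can be very small. What follows is only a rough sketch in Python, not code from the article or from any real find implementation. The node classes (`Name`, `LargerThan`, `And`, `Or`, `Not`) and the `find` function are names invented for illustration, and they cover just a tiny subset of what find accepts.

```python
import fnmatch
import os
from dataclasses import dataclass


# Hypothetical node types covering a tiny subset of find's expression language.
@dataclass
class Name:
    pattern: str  # roughly `-name PATTERN`

    def matches(self, path, st):
        return fnmatch.fnmatchcase(os.path.basename(path), self.pattern)


@dataclass
class LargerThan:
    nbytes: int  # roughly `-size +Nc`

    def matches(self, path, st):
        return st.st_size > self.nbytes


@dataclass
class And:
    left: object
    right: object

    def matches(self, path, st):
        # Python's `and` short-circuits, so the right side is skipped on failure.
        return self.left.matches(path, st) and self.right.matches(path, st)


@dataclass
class Or:
    left: object
    right: object

    def matches(self, path, st):
        return self.left.matches(path, st) or self.right.matches(path, st)


@dataclass
class Not:
    operand: object

    def matches(self, path, st):
        return not self.operand.matches(path, st)


def find(root, expr):
    """Yield each path below root for which expr is true.

    Unlike real find, this sketch does not test the root itself.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # One lstat per entry: the file system cost discussed above.
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable; skipped in this sketch
            if expr.matches(path, st):
                yield path
```

Something like `find(".", And(Name("*.txt"), Not(LargerThan(4096))))` then walks the expression tree once per directory entry. Each evaluation is a handful of method calls, next to at least one `lstat` per entry.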


tasty_freeze • yesterday at 3:39 PM

That is a fun exercise, but I imagine the time to evaluate the conditional expression is a tiny fraction, a percent or less, of the time it takes to make the file system calls.


burnt-resistor • today at 4:02 AM

I recently wrote a "du" summarizer of additional stats in C because it's faster than du, find, or any sort of scripting language tree walker. The scripted walkers are orders of magnitude slower, but ultimately everything is bounded by iteration of kernel VFS structures and any hard IOPS that are spent to fetch metadata from slower media.

For archiving, I also wrote a parallel walker and file hasher that makes only one pass over the data and stores the results in a SQLite database. It's basically a poor man's IDS and bitrot detector.
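
To give an idea of the shape of such a tool, here is a minimal sketch in Python. It is not the commenter's code, which was not posted. It is sequential rather than parallel, and the table layout, the function names `hash_file` and `scan`, and the rule for what counts as suspicious are all assumptions made for illustration.

```python
import hashlib
import os
import sqlite3
import stat


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it once in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root, db_path):
    """Hash each regular file under root and record it in SQLite.

    Returns the paths whose hash changed although size and mtime did not,
    which is roughly what silent corruption or tampering would look like.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, sha256 TEXT)"
    )
    suspicious = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
                if not stat.S_ISREG(st.st_mode):
                    continue  # skip symlinks, sockets, devices
                sha = hash_file(path)
            except OSError:
                continue  # unreadable or vanished; ignored in this sketch
            old = con.execute(
                "SELECT size, mtime_ns, sha256 FROM files WHERE path = ?", (path,)
            ).fetchone()
            # Same size and mtime as last time but different content: flag it.
            if old is not None and old[:2] == (st.st_size, st.st_mtime_ns) and old[2] != sha:
                suspicious.append(path)
            con.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (path, st.st_size, st.st_mtime_ns, sha),
            )
    con.commit()
    con.close()
    return suspicious
```

The database should live outside the tree being scanned, otherwise the database file gets hashed on every run as well. A parallel version would presumably hand `hash_file` to a pool of workers and keep the SQLite writes on a single connection.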
