Hacker News

lsr: ls with io_uring

320 points by mpweiher yesterday at 12:40 PM | 153 comments

https://tangled.sh/@rockorager.dev/lsr


Comments

rockorager yesterday at 3:46 PM

Author of the project here! I have a little write up on this here: https://rockorager.dev/log/lsr-ls-but-with-io-uring

ninkendo yesterday at 1:21 PM

I wonder how it performs against an NFS server with lots of files, especially one over a kinda-crappy connection. Putting an unreliable network service behind blocking POSIX syscalls is one of the main reasons NFS is a terrible design choice (as can be seen by anyone who's tried to ctrl+c any app that's reading from a broken NFS folder), but I wonder if io_uring mitigates the bad parts somewhat.

swiftcoder yesterday at 4:51 PM

Kind of fascinating that slashing syscalls by ~35x (versus the `ls -la` benchmark) is "only" worth a 2x speedup
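
One way to read those two numbers together is Amdahl's law: if only part of the runtime was syscall overhead, shrinking that part by 35x caps the overall gain. A back-of-the-envelope sketch, assuming (hypothetically) that syscall cost scales with syscall count and that everything else in the program is unchanged; the function names are made up for illustration:

```python
def overall_speedup(syscall_fraction: float, syscall_reduction: float) -> float:
    """Amdahl's law: only the syscall share of the runtime gets faster."""
    return 1.0 / ((1.0 - syscall_fraction) + syscall_fraction / syscall_reduction)


def implied_syscall_fraction(speedup: float, syscall_reduction: float) -> float:
    """Invert Amdahl's law: which syscall share would explain the observed speedup?"""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / syscall_reduction)


# Figures from the comment above: ~35x fewer syscalls, ~2x faster overall.
fraction = implied_syscall_fraction(speedup=2.0, syscall_reduction=35.0)
print(f"implied syscall share of runtime: {fraction:.0%}")  # prints 51%
```

Under that simplified model, roughly half of the original runtime was syscall overhead, and the other half (sorting, formatting, writing output) is not touched by the change.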

maplant yesterday at 2:31 PM

This seems more interesting as a demonstration of the amortized performance increase you'd expect from using io_uring, or as a tutorial for using it. I don't understand why I'd switch from using something like eza. If I'm listing 10,000 files, the difference is between 40ms and 20ms. I absolutely would not notice that for a single invocation of the command.

Imustaskforhelp yesterday at 1:24 PM

Really interesting, and the difference is real. I would just hope that better coloring support could be added: I have "eza --icons=always -1" set as my ls and it looks really good, whereas with lsr -1 the fundamentals are the same but the coloring is not.

Yes lsr also colors the output but it doesn't know as many things as eza does

For example, .opus shows up with a music icon and the right color (green-ish in my case?) in eza, whereas it shows up as any normal file in lsr.

Really no regrets though, it's quite easy to patch I think, but yes this is rock solid and really fast I must admit.

Can you please create more such things but for cat and other system utilities too please?

Also love that it's using tangled.sh which is using atproto, kinda interesting too.

I also like that it's written in Zig, which imo feels way easier for me to touch as a novice than Rust (sry rustaceans)

SillyUsername yesterday at 1:13 PM

Love it.

I'm trying to understand why all command line tools don't use io_uring.

As an example, all my nvme's on usb 3.2 gen 2 only reach 740MB/s peak.

If I use tools with aio or io_uring I get 1005MB/s.

I know I may not be copying many files simultaneously every time, but the queue length strategies and the fewer locks also help I guess.

_ache_ today at 3:56 AM

Why doesn't it scale? I'm really interested in n=100,000 and n=1,000,000.

eza could catch up with lsr. Why is n=1,000 so slow with lsr? Does io_uring add that much overhead?

the8472 yesterday at 2:13 PM

io_uring doesn't support getdents though, so the primary benefit is bulk statting (ls -l). It'd be nice if we could have a getdents in flight while processing the results of the previous one.
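
To make the split concrete, here is a plain blocking sketch of the two phases of a long listing. It does not use io_uring; `list_long` is a hypothetical helper that only shows which part is a single sequential directory read and which part is a set of independent per-file stats that a batched interface could submit together:

```python
import os
import tempfile


def list_long(path: str) -> list[tuple[str, int]]:
    """Return (name, size) pairs, structured as two separate phases."""
    # Phase 1: read the directory entries (readdir, i.e. getdents on Linux).
    with os.scandir(path) as entries:
        names = sorted(entry.name for entry in entries)

    # Phase 2: one stat per name. Each call is independent of the others,
    # so this loop is the part that could be issued as one batch.
    return [(name, os.lstat(os.path.join(path, name)).st_size) for name in names]


with tempfile.TemporaryDirectory() as tmp:
    for name, payload in (("a.txt", b"x"), ("b.txt", b"xyz")):
        with open(os.path.join(tmp, name), "wb") as fh:
            fh.write(payload)
    listing = list_long(tmp)
    print(listing)  # [('a.txt', 1), ('b.txt', 3)]
```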

pvtmert yesterday at 6:26 PM

> I have no idea what lsd is doing. I haven’t read the source code, but from viewing it’s strace, it is calling clock_gettime around 5 times per file. Why? I don’t know. Maybe it’s doing internal timing of steps along the way?

Maybe it's calculating the "X minutes/hours/days/weeks ago" string for each timestamp (access, create, modify, ...)? It could just be an old artifact of some library function...
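
If that guess is right, the clock only needs to be read once per listing rather than once per timestamp. A minimal sketch of the idea; this is not taken from lsd's source, and `relative_age` and its unit thresholds are made up for illustration:

```python
import time


def relative_age(timestamp: float, now: float) -> str:
    """Format a timestamp as a coarse 'N units ago' string relative to `now`."""
    delta = max(0, round(now - timestamp))
    for unit, seconds in (("week", 604800), ("day", 86400), ("hour", 3600), ("minute", 60)):
        if delta >= seconds:
            count = delta // seconds
            return f"{count} {unit}{'s' if count != 1 else ''} ago"
    return "just now"


now = time.time()  # a single clock read, reused for every file in the listing
mtimes = [now - 30, now - 90, now - 7200, now - 3 * 86400]
print([relative_age(m, now) for m in mtimes])
```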

jasonjmcghee yesterday at 4:16 PM

I find it funny that there are icons for .mjs and .cjs file extensions but not .c, .h, .sh

buybackoff yesterday at 8:41 PM

A little off-topic, but do you know a number in usecs that io_uring can save on enterprise-grade servers with 10G NICs, for socket latency overheads vs LD_PRELOAD when the hardware supports that? Let's say it's Mellanox 4 or 5. My understanding is that each gives around 10us savings, maybe less. That is based on some benchmarking which was not focused on any of those explicitly but included some imprecise experiments. It also looks like they do not add up. Do you have a number based on real experience?

tln yesterday at 2:28 PM

The times seem sublinear, 10k files is less than 10x 1k files.

I remember getting into a situation during the ext2 and spinning rust days where production directories had 500k files. ls processes were slow enough to overload everything. ls -F saved me there.

And filesystems got a lot better at lots of files. What filesystem was used here?

It's interesting how well busybox fares, it's written for size not speed iirc?

quibono yesterday at 1:45 PM

Lovely, I might try doing this for some other "classic" utility!

A bit off-topic too, but I'm new to Zig and curious. This here:

```
const allocator = sfb.get();

var cmd: Command = .{ .arena = allocator };
```

means that all allocations need to be written with an allocator in mind? I.e. one has to pick an allocator per each memory allocation? Or is there a default one?

Bender yesterday at 1:25 PM

I am curious what would happen if ls and other commands were replaced using io_uring and kernel.io_uring_disabled was set to 1. Would it fall back to an older behavior or would the ability to disable it be removed?
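
One way a tool could handle this is to check the sysctl up front and pick a code path. The sketch below is a hypothetical policy, not what lsr or any coreutils implementation does. It assumes the value meanings as I understand them from the kernel sysctl documentation: 0 allows io_uring for everyone, 1 blocks ring creation for unprivileged processes outside the configured io_uring group, and 2 disables it for all processes.

```python
from pathlib import Path
from typing import Optional

SYSCTL_PATH = Path("/proc/sys/kernel/io_uring_disabled")


def read_io_uring_disabled(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the sysctl value, or None if the file is missing or unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def pick_backend(disabled: Optional[int]) -> str:
    """Hypothetical policy: use io_uring only when the sysctl clearly allows it."""
    if disabled == 0:
        return "io_uring"
    # 1 restricts ring creation, 2 disables it entirely, and None means the
    # sysctl could not be read: take the blocking path in all three cases.
    return "blocking-syscalls"


print(pick_backend(read_io_uring_disabled()))
```

This policy is deliberately conservative: kernels that predate the sysctl have no such file, so it would fall back there even though io_uring may work. A real tool would more likely just try to set up a ring and fall back if that fails.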

adgjlsfhk1 yesterday at 1:55 PM

It's a shame to see uutils doing so poorly here. I feel like they're our best hope for an organization to drive this sort of core modernization forward, but 2x slower than GNU isn't a good start.

mnw21cam yesterday at 6:00 PM

Love the idea and execution, don't love the misplaced apo'strophe's.

ReDress yesterday at 2:07 PM

I've been playing around with io_uring for a while.

Still, I have yet to come across any tests that simulate a typical real-life application workload.

I've heard of fio but have yet to check how exactly it works and whether it might be possible to simulate a real-life application workload with it.

neuroelectron yesterday at 1:27 PM

There used to be lsring by Jens Axboe (author of io_uring), but it no longer exists. This is more extreme than abandoning the project. Perhaps there is some issue with using io_uring this way, perhaps vulnerabilities are exposed.

fermuch yesterday at 1:14 PM

The link isn't working for me. For those who were able to see it: does it improve anything by using that instead of what ls does now??

movomito yesterday at 1:12 PM

Link doesn’t work

api yesterday at 3:06 PM

Why isn’t it possible — or is it — to make libc just use uring instead of syscall?

Yes I know uring is an async interface, but it’s trivial to implement sync behavior on top of a single chain of async send-wait pairs, like doing a simple single threaded “conversational” implementation of a network protocol.

It wouldn’t make a difference in most individual cases but overall I wonder how big a global speed boost you’d get by removing a ton of syscalls?

Or am I failing to understand something about the performance nuances here?
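
The "sync on top of async" idea can be sketched without io_uring at all. Below, a single-worker thread pool stands in for the submission/completion queue (an assumption for illustration only), and `submit_stat` / `sync_stat` are hypothetical names:

```python
from concurrent.futures import Future, ThreadPoolExecutor
import os
import stat

# Stand-in for a submission/completion queue; this is NOT io_uring itself.
_queue = ThreadPoolExecutor(max_workers=1)


def submit_stat(path: str) -> Future:
    """Asynchronous half: enqueue the request and return immediately."""
    return _queue.submit(os.stat, path)


def sync_stat(path: str) -> os.stat_result:
    """Synchronous facade: one submit immediately followed by one wait."""
    return submit_stat(path).result()


info = sync_stat(os.curdir)
print(stat.S_ISDIR(info.st_mode))  # True: the current directory is a directory
```

The facade is easy to write, but each call still pays for one submit and one wait. As far as I understand, a one-at-a-time chain like this would not reduce the number of kernel transitions; the savings come from queuing many requests before waiting.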

rkangel yesterday at 1:37 PM

This was more interesting for the tangled.sh platform it's hosted on. Wasn't aware of that!

danbruc yesterday at 1:45 PM

Why does this require inventing lsr as an alternative to ls instead of making ls use io_uring? It seems pretty annoying to have to install replacements for the most basic command line tools. And especially in this case, where you do not even do it for additional features, just for getting the exact same thing done a bit faster.
