Two big advantages: You avoid an unnecessary copy. Normal read system call gets the data from disk...

kennethallen • yesterday at 4:25 AM • 3 replies • view on HN

Two big advantages:

You avoid an unnecessary copy. Normal read system call gets the data from disk hardware into the kernel page cache and then copies it into the buffer you provide in your process memory. With mmap, the page cache is mapped directly into your process memory, no copy.

All running processes share the mapped copy of the file.

There are a lot of downsides to mmap: you lose explicit error handling and fine-grained control of when exactly I/O happens. Consult the classic article on why sophisticated systems like DBMSs do not use mmap: https://db.cs.cmu.edu/mmap-cidr2022/

Replies

nextaccountic • yesterday at 5:44 PM

> Consult the classic article on why sophisticated systems like DBMSs do not use mmap: https://db.cs.cmu.edu/mmap-cidr2022/

Sqlite does (or can optionally use mmap). How come?

Is sqlite with mmap less reliable or anything?

➕ show 2 replies

commandersaki • yesterday at 8:24 AM

you lose explicit error handling

I've never had to use mmap but this is always been the issue in my head. If you're treating I/O as memory pages, what happens when you read a page and it needs to "fault" by reading the backing storage but the storage fails to deliver? What can be said at that point, or does the program crash?

saidinesh5 • yesterday at 5:31 AM

This is a very interesting link. I didn't expect mmap to be less performant than read() calls.

I now wonder which use cases would mmap suit better - if any...

> All running processes share the mapped copy of the file.

So something like building linkers that deal with read only shared libraries "plugins" etc ..?

➕ show 1 reply

alt Hacker News

Replies