Also, > Thread stacks come up because reserving them completely ahead of time would incur large...

inkyoto • yesterday at 3:48 AM • 1 reply • view on HN

Also,

> Thread stacks come up because reserving them completely ahead of time would incur large amounts of memory usage. Typically they start small and grow when you touch the guards. This is a form of overcommit.

Ahead of the time memory reservation entails a page entry being allocated in the process’s page catalogue («logical» allocation), and the page «sits» dormant until it is accessed and causes a memory access fault – that is the moment when the physical allocation takes place. So copying reserved but not accessed yet pages has zero effect on the physical memory consumption of the process.

What actually happens to the thread stacks depends on the actual number of active threads. In modern designs, threads are consumed from thread pools that implement some sort of a run queue where the threads sit idle until they get assigned a unit of work. So if a thread is idle, it does not use its own stack thread and, consequently, there is no side effect on the child's COW address space.

Granted, if the child was copied with a large number of active threads, the impact will be very different.

> Even windows dynamically grows stacks like this

Windows employs a distinct process/thread design, making the UNIX concept of a process foreign. Threads are the primary construct in Windows and the kernel is highly optimised for thread management rather than processes. Cygwin has outlined significant challenges in supporting fork(2) semantics on Windows and has extensively documented the associated difficulties. However, I am veering off-topic.

Replies

barchar • yesterday at 5:42 AM

I am aware reserving excess memory doesn't commit said memory. But it does reserve memory, which is what we were talking about. The point was that because you can have a lot of threads and restricting reserved stacks to some small value is annoying all systems overcommit stack. Windows initially commits some memory (reserving space in the page file/ram) for each but will dynamically commit more when you touch the guard. This is overcommit. Linux does similarly.

Idle threads do increase the amount of committed stack. Once their stack grows it stays grown, it's not common to unmap the end and shrink the stacks. In a system without overcommit these stacks will contribute to total reserved phys/swap in the child, though ofc the pages will be cow.

> Windows employs a distinct process/thread design, making the UNIX concept of a process foreign. Threads are the primary construct in Windows and the kernel is highly optimised for thread management rather than processes. Cygwin has outlined significant challenges in supporting fork(2) semantics on Windows and has extensively documented the associated difficulties. However, I am veering off-topic.

The nt kernel actually works similarly to Linux w.r.t. processes and threads. Internally they are the same thing. The userspace is what makes process creation slow. Actually thread creation is also much slower than on Linux, but it's better than processes. Defender also contributes to the problems here.

Windows can do cow mappings, fork might even be implementable with undocumented APIs. Exec is essentially impossible though. You can't change the identity of a process like that without changing the PID and handles.

Fun fact: the clone syscall will let you create a new task that both shares VM and keeps the same stack as the parent. Chaos results, but it is fun. You used to be able to share your PID with the parent too, which also caused much destruction.

➕ show 1 reply

alt Hacker News

Replies