logoalt Hacker News

cylemonsyesterday at 3:17 PM3 repliesview on HN

How else would you "modify" immutable data other than by copying it?


Replies

ndryesterday at 3:24 PM

You don't have to copy everything.

Look up persistent data structures [0]

The main trick is to store everything in a tree with a large branching factor.

The elements are on your leaves. When you want to make an edit you need to create only the nodes from the root to the actual leaf. Then you return the new root. Everything else you can share.

This is a log operation, normally implemented with branch 32.

It works with vectors as well as dicts. Creating a copy is constant, it's immutable can be shared freely (especially on GC'd languages).

Insert/Delete are O(log32(n)) instead of O(n) they'd be with a full copy.

[0] https://en.wikipedia.org/wiki/Persistent_data_structure#Pers...

thechaoyesterday at 3:26 PM

And, yet, somehow, functional languages have been doing this since forever? I jest, I jest! There's an entire field of study focused on this! I'm no expert, but imagine you've got a singly-linked list, and we're going to allow only push_back() and pull_back() (no insert()). Let's go ahead and let multiple owners have at this list. If A does "pull_back()" what really happens is A has a local "tail" pointer that moves back along the end of the list and has a new belief about the end of the list. When A does "push_back()" it begins attaching links at its new tail. Since the links just "point backwards" without modifying the old links, this doesn't mutate the list "as is", it only mutates A's version of the list. So, if B is also looking at the list, it could just start doing "push_back()" and add its own "tail". The result is a data-structure that's a "bunch of shared singly linked lists" in a bush structure that's more akin to a tree than a singly linked list.

You can do the same thing with binary trees and other structures. It's more fiddly, and there's definitely places where single-ownership allows mutability "under the hood" for real performance gains, but that's the basic idea.

show 1 reply
tylerhouyesterday at 11:21 PM

Basically, a proxy. You don't need to deep copy; you just need to return a proxy object that falls back to the original dict if the key you are requesting has not been found or modified.

Functional data structures essentially create a proxy on every write. This can be inefficient if you make writes in batches, and you only need immutability between batches.