Hacker News

sirwhinesalot · last Saturday at 5:45 PM · 1 reply

NOTE: I'm a fan of value semantics, mostly devil's advocate here.

Those implicit copies have downsides that make them a bad fit for several kinds of languages.

Swift doesn't enforce value semantics, but most types in the standard library do follow them (even dictionaries and such), and those types go out of their way to use copy-on-write to avoid unnecessary copying as much as possible. Even with that optimization there are too many implicit copies! (It could be argued that copy-on-write makes things worse, since it is harder to predict when the copies actually happen.)
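The copy-on-write idea isn't Swift-specific; a minimal sketch of the same mechanism can be written in Rust with `Rc::make_mut`, which clones the underlying data only at the moment a shared value is mutated (the names below are just illustrative):

```rust
use std::rc::Rc;

fn main() {
    // Two handles share one allocation; no element data is copied yet.
    let a = Rc::new(vec![1, 2, 3]);
    let mut b = Rc::clone(&a); // cheap: just bumps the reference count

    // Mutation triggers the actual deep copy, because the buffer is shared.
    Rc::make_mut(&mut b).push(4);

    assert_eq!(*a, vec![1, 2, 3]);    // original untouched
    assert_eq!(*b, vec![1, 2, 3, 4]); // b got its own copy on write

    // Now b is the sole owner, so this mutation copies nothing.
    Rc::make_mut(&mut b).push(5);
    println!("{:?} {:?}", a, b);
}
```

The unpredictability mentioned above shows up here too: whether `make_mut` is cheap or expensive depends entirely on whether another handle happens to be alive at that point.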

Implicit copies of very large data structures are almost always unwanted, effectively a bug, and having the compiler check this (as in Rust, or a C++ type without a copy constructor) can help detect said bugs. It's not all that dissimilar to NULL checking: NULL checking requires a lot of extra annoying machinery, but it avoids so many bugs that it is worth doing.
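As a sketch of that compiler check, here is a Rust version (type names are hypothetical): a heap-owning type that doesn't implement `Copy` moves on assignment, so duplicating it requires a visible `.clone()` call, and accidentally using the old binding is a compile error rather than a silent copy:

```rust
// A large, heap-owning structure. Because it does not implement `Copy`,
// assignment moves it, and duplicating it requires an explicit `.clone()`.
#[derive(Clone)]
struct BigBuffer {
    data: Vec<u8>,
}

fn consume(buf: BigBuffer) -> usize {
    buf.data.len()
}

fn main() {
    let big = BigBuffer { data: vec![0u8; 1_000_000] };

    let explicit_copy = big.clone(); // the expensive copy is visible in the source
    let n = consume(big);            // `big` is moved here, not copied

    // consume(big); // compile error: use of moved value `big`
    assert_eq!(n, explicit_copy.data.len());
}
```

This is the same discipline as `T(const T&) = delete;` in C++, just enabled by default instead of opt-in.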

So you have to have a plan for how to avoid unnecessary copying. "Move-only" types are one way, but then the question is which types you make move-only. Copying a small vector is usually fine, but copying a huge one probably isn't. You have to decide, for each heap-allocated type, whether you want it move-only or implicitly copyable (with the caveats above), which is not trivial. You can also add "view" types like slices, but now you need to worry about tracking lifetimes.
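The view-type tradeoff can be sketched in Rust, where a slice borrows the data instead of copying it and the borrow checker is the machinery that tracks the lifetime (function names here are just for illustration):

```rust
// A "view" borrows the data instead of copying it; the borrow checker
// then has to enforce that the view cannot outlive the owner.
fn first_half(v: &[i32]) -> &[i32] {
    &v[..v.len() / 2]
}

fn main() {
    let owner = vec![1, 2, 3, 4];
    let view = first_half(&owner); // no copy: just a pointer and a length
    assert_eq!(view, &[1, 2]);

    // drop(owner); // compile error: `owner` is still borrowed by `view`
    println!("{:?}", view);
}
```

The copy is avoided, but only because the language grew an entire lifetime-tracking apparatus to make the borrow safe.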

For these new C alternative languages, implicit heap copies are a big no-no. They have very few implicit calls: there are no destructors, and allocators are explicit. Implicit copies could be supported with a default temp allocator that follows a stack discipline, but then you are imposing a specific structure on the temp allocator.
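To make "stack discipline" concrete, here is a toy sketch (in Rust, with made-up names; real temp allocators in languages like Zig or Odin differ in the details): allocation bumps a cursor, and resetting to a saved mark frees everything allocated after it at once.

```rust
// Toy temp allocator with a stack discipline: allocations bump a cursor,
// and resetting to a saved mark releases everything after it in O(1).
struct TempArena {
    buf: Vec<u8>,
    cursor: usize,
}

impl TempArena {
    fn new(capacity: usize) -> Self {
        TempArena { buf: vec![0; capacity], cursor: 0 }
    }

    /// Reserve `size` bytes; returns their offset, or None if the arena is full.
    fn alloc(&mut self, size: usize) -> Option<usize> {
        if self.cursor + size > self.buf.len() {
            return None;
        }
        let offset = self.cursor;
        self.cursor += size;
        Some(offset)
    }

    fn mark(&self) -> usize {
        self.cursor
    }

    /// Pop everything allocated since `mark`.
    fn reset_to(&mut self, mark: usize) {
        self.cursor = mark;
    }
}

fn main() {
    let mut arena = TempArena::new(1024);
    let m = arena.mark();
    let a = arena.alloc(100).unwrap();
    let b = arena.alloc(200).unwrap();
    assert_eq!((a, b), (0, 100));
    arena.reset_to(m); // both temporaries gone in one step
    assert_eq!(arena.mark(), 0);
}
```

The constraint the comment points at is visible here: an implicit copy that allocated from this arena would only be valid until the next `reset_to`, so the language would be baking that last-in-first-out lifetime into every implicit copy.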

It's not something that can just be added to any language.


Replies

netbioserror · last Saturday at 5:54 PM

And so the size of your data structures matters. I'm processing lots of data frames, but each represents a few dozen kilobytes and, in the worst case, a large composite of data might add up to a couple dozen megabytes. It's running on a server with tons of processing power and memory to spare. I could force my worst-case copying scenario in parallel on every core, and our bottleneck would still be the database hits before it all starts.

It's a tradeoff I am more than willing to make if it means the processing semantics are basically straight out of the textbook, with no extra memory-semantic noise. That textbook clarity is worth more to my company's business than saving the server a couple hundred milliseconds on a 1-second process that does not have the request volume to justify the savings.
