
The beauty and simplicity of the good old C-style void* in C++

52 points by movd128 last Saturday at 6:21 PM | 100 comments

Comments

kstenerud today at 3:39 PM

The real question here is: WHY are you passing a blob of memory rather than a struct that uses the type system to describe and enforce what the contents are?

I don't mean dressing up an anonymous pointer, which the author rightly complains about. I mean WHY are you making an API that takes such a pointer to an unknown type to begin with? Whenever you change the structure within that blob, your type checker won't flag that the receiver hasn't been updated to handle it.

Even worse: nothing's stopping you from accidentally passing in the wrong type.

And now you have a SEGV. Or a security hole.

stinos today at 10:41 AM

> Why should people complexify and uglify their C++ code with the uint8_t pointer (or std::byte), when void* works just fine??

Fair point (although to be honest: 'complexify' feels a bit of an exaggeration here to me), but the answer to this why is simple: document and express intent clearly. The compiler gave you an error first such that you're forced to consider what you're doing. Any seasoned C++ developer seeing this knows what this reinterpret_cast means.

> Wow. With std::span the complexity-meter bumps in the red zone and goes even higher!

Same remark: yes, it's a bit more text to read, but again: to me (and many others I'm guessing) this clearly expresses intent. I also do not find it particularly hard to read. I mean, it's C++, you're likely going to encounter templates at one point or another, except in super specific software perhaps. But no-one also ever argued the C++ learning curve was easy, and trying to make it easier by refusing to use features which were added for good reasons and instead going back to constructs which are the very source of those reasons seems a bit backwards.

> As a nice addition, if you use SAL annotations, the function could be decorated a bit to help code analyzers detecting memory bugs

Some might also say it complexifies and uglifies the code. And in any case makes it non-portable on top of that.

cherryteastain today at 2:04 PM

In C++ you'd just do

    template <typename T>
    void DoSomething(T const& data);
or if T* is supposed to point to a tightly packed buffer

    template <typename T>
    void DoSomething(std::span<T> data);
as the author pointed out. I don't see how that is ugly or more complicated than the original void* approach.

There is no need to pass the size of T or the length of the span: the former is just a sizeof(T) away and the latter is a data.size() away.

In fact, a lot of codebases would outright ban the uint8_t* and reinterpret_cast trick the author is complaining about via clang-tidy rules.

gignico today at 10:20 AM

> It seems that some people are really losing the taste for good readable code.

It seems that some people never had taste for good reliable code. Use `void *` and now any error whatsoever is direct undefined behavior. Moreover `std::span` clearly says that you are not taking ownership of the memory (even though the language does not check it of course), while `void *` does not.

I understand that people can have many things to say about C++, and I do as well, but `std::span` should have been there decades ago and is such a life saver in these situations. A truly zero-cost abstraction which effectively saves you from a lot of troubles.

IX-103 today at 12:20 PM

Ick. The entire article starts from the fundamentally flawed premise that "you want a function that takes a blob of memory as an argument". Then they discuss bytewise access into structures..

Passing around void pointers is simply not a safe thing to do in C++. You can't do anything with a void pointer, so you're probably going to cast it to something else. Use that type instead, so that your caller knows they need to pass a valid pointer to that type. If the pointer has the wrong alignment then that will result in undefined behavior. If you need to support multiple pointer types, use templates.

And, unless there are some really weird circumstances, you actually don't want to access your structures bytewise. Offsets can shift with compiler flags/versions. If you want serialization, please use a serialization library that correctly handles all of the odd cases. These can be quite efficient.

I've only actually had to munge bytes in a class once. Somebody decided that a previously POD class that was passed between processors with different memory spaces needed a virtual function, so I had to overwrite the vtable when I received it to make it valid.

gpderetta today at 3:38 PM

Making DoSomething a template because span is a template is a non-sequitur.

If DoSomething works with untyped bytes, it should require a std::span<byte> (or const byte if read only). Incidentally the standard provides a convenient as_bytes(std::span<T>) -> std::span<const byte> (and as_writable_bytes for the mutable case). There isn't an equally convenient helper to convert a single object to a span of bytes, but it is easy to write.

As to why one should use span: a) it helps make sure that the size travels together with the pointer for some additional safety, b) it is more convenient to work with byte ranges than void pointers (which do not support pointer arithmetic), c) it helps a bit in communicating intent: in C++, void* is used more often for type erasure than for byte-related things.

DennisL123 today at 3:49 PM

<snark> Arguably the one thing C++ is great at is its type system. Makes total sense to cast it away. </snark>

adrian_b today at 2:32 PM

The "void*" of C solves a frequently encountered problem, but it has an inappropriate and misleading name, because such a pointer points to something, it does not point to nothing. Moreover, it does not have a size.

The correct solution in a programming language is to have a primitive bit string type (with a length that is a byte multiple) and to have a concise way (e.g. with dedicated symbolic operators) to write a type conversion from any data type to a bit string and a type conversion from a bit string to any data type.

Then the operations that make sense for arbitrary bit strings, e.g. copying, moving, input/output operations (e.g. file read and write), applying Boolean functions, shall have formal parameters of this type.

Much of what I have described here already existed in the language IBM PL/I, more than 60 years ago, except that in it only the conversion towards a bit string was explicit, with the built-in function "bit", while the conversions from bit strings to other data types were done implicitly, upon variable assignment.

Like any kind of array, a bit string must have an associated size, so there should be no need to specify it explicitly as a separate parameter.

ndesaulniers today at 3:37 PM

I find this point to be generally why C can typically beat C++ in terms of code size; generic functions operating on void* are much less type safe, but the tradeoff is code size. Those template instantiations add up.

rfgplk today at 2:59 PM

> void DoSomething(const void* p, size_t numBytes)

would be something like

template <typename T> void DoSomething (const T& ref) or void DoSomething(const T& ref, size_t numBytes) or C++20-y void DoSomething (const auto& ref)

If the class you're passing in already has a size-like member function: template <typename T> requires requires(T t) { t.size(); } void DoSomething(const T& x) { ... x.size(); }

> void DoSomething(const uint8_t* p, size_t numBytes)

This is awful: you lose type info irreversibly.

> template <typename T> void DoSomething(std::span<T> data)

You can do this but the above examples work just as well.

> Or maybe something even more complicated, like this?

> template <typename T, std::size_t N> void DoSomething(std::span<T, N> data)
>
> // Or this?
> template <typename T, std::size_t N> void DoSomething(std::span<const T, N> data)

This is more explicit, not more complicated...

> In this way, we still keep the clarity and simplicity of the function invocation:
> DoSomething(&data, sizeof(data));

Stripping types is not a good idea, especially because you'll run into object lifetime issues _REALLY QUICKLY_. You need to guarantee that the object is trivially copyable.
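
One way to enforce that guarantee at compile time, as a hedged sketch: keep the byte-level function, but put a typed wrapper in front of it. `DoSomethingTyped` and `Pod` are hypothetical names, and the body of `DoSomething` is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Byte-level function in the style of the original post.
// Placeholder body: it only reports how many bytes it was handed.
inline std::size_t DoSomething(const void* p, std::size_t numBytes) {
    return p != nullptr ? numBytes : 0;
}

// Hypothetical typed front door: refuses types whose raw bytes are not
// a meaningful, copyable representation of the object.
template <typename T>
std::size_t DoSomethingTyped(const T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "T must be trivially copyable to be treated as a blob");
    return DoSomething(&value, sizeof(T));
}

struct Pod {
    int a;
    float b;
};

static_assert(std::is_trivially_copyable_v<Pod>);
// std::string owns heap memory, so DoSomethingTyped(std::string{}) is rejected.
static_assert(!std::is_trivially_copyable_v<std::string>);

inline void ExampleTypedUsage() {
    const Pod p{1, 2.0f};
    assert(DoSomethingTyped(p) == sizeof(Pod));
}
```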

nine_k today at 3:31 PM

I first wanted to compare the use of void * to the use of a chainsaw. But then I realized that a chainsaw has many more safety features.

voidUpdate today at 10:23 AM

> "An interesting question you may ask in C++ is: “How would you declare a function that takes a blob of memory as input?”"

> "Now, suppose that you want to pass to this function a custom structure, like this:"

You would create another function that actually works based off that structure, rather than using your first function which operates on a set of bytes in memory. That way it's readable, like they want, and type-safe

delta_p_delta_x today at 11:06 AM

The blogger and the blog say:

> BTW: As a nice addition, if you use SAL annotations

> Windows C++ Programming

Not everyone will see the irony, but the Windows user-mode application and library suite and the kernel now very heavily rely on the safety mechanisms of C++ that the author calls 'complex', 'uglif[ied]', and has 'los[t] the taste for good readable code'. I'm of course referring to the Windows Implementation Library: https://github.com/microsoft/wil This is explicitly an effort from MS WinDev to make Windows C++ code safer. User-mode applications writing native Windows code can and absolutely should use it, too.

Any time I see `void*` in C++ I ring-fence it as a C-ism and make sure I `reinterpret_cast`. For me, a bag of bytes is `std::span<std::byte>`. void* is a memory location with no provenance, no ownership, no size information, nothing. Do I even know if it is this program's memory, or some shared memory construct, or maybe even a pointer into GPU memory? No for all.

C likes to play fast and loose and its proponents call it 'beautiful and simple', I call it a segfault/use-after-free/double-free waiting to happen.

mwkaufma today at 1:39 PM

If you know you want bytes -- A void* of unknown provenance cast to anything other than char* is UB so just skip the middleman and use char*.

arcadialeak today at 10:36 AM

char* is an exception to strict aliasing rules of C++ precisely to facilitate the author's use case. You would still need a reinterpret_cast to make it work, but it's actually good because it makes the intent clearer, and the cast would have still happened either way to read the raw bytes.
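
A small sketch of that pattern. It uses `unsigned char`, which shares the aliasing exemption with `char` and `std::byte`; the function names are hypothetical and the byte sum stands in for whatever the reader of raw bytes actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Works on raw bytes only; the parameter type already says so.
inline std::uint32_t SumBytes(const unsigned char* bytes, std::size_t numBytes) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < numBytes; ++i) {
        sum += bytes[i];
    }
    return sum;
}

// The reinterpret_cast at the call site documents that we are deliberately
// inspecting the object representation, which the aliasing rules permit
// for unsigned char.
template <typename T>
std::uint32_t SumObjectBytes(const T& object) {
    return SumBytes(reinterpret_cast<const unsigned char*>(&object), sizeof(T));
}

inline void ExampleAliasingUsage() {
    const std::uint32_t v = 0x01010101u;  // four bytes of 0x01 in any byte order
    assert(SumObjectBytes(v) == 4u);
}
```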

jayd16 today at 2:05 PM

Here's a thought experiment. Is void* something we should add to other languages?

Would anyone argue yes?

zahlman today at 1:38 PM

Many years ago I remember going through the Boost library and seeing C-style casts that seemed entirely gratuitous. I tried replacing them with what I was pretty sure were the equivalent C++ reinterpret_casts, and the result didn't compile. I never did figure it out.

akkaygin today at 11:38 AM

> In fact, std::span is a class template, and somebody would suggest to make the function that processes the generic memory blob a function template! Really? Something like this??

Yes.

delegate today at 10:56 AM

It depends on what your function does with that memory. If the fn expects any kind of structure at that address, you and your callers are on your own: the compiler can't help if the caller passes the wrong thing. Worse, accessing that memory might not immediately crash, but lead to strange side effects in your program.

Dynamic languages can handle this with reflection, but with void* you can only pray nobody makes the mistake..

void-star today at 3:17 PM

This!

… Is why I picked my name.

jeffbee today at 3:17 PM

While I sympathize with the aesthetic theme of this post, I warn against the temptation to do what the post proposes, which is to try to compute the checksum of an object represented by void pointer and extent. There are many dangers here, one of which is that the checksum may read uninitialized memory, making the checksum meaningless. Another is that the pointer implicitly converted to void may be a different address depending on the type of the object in the calling function, if the type has multiple base classes. Further, your void reader may be reading derived class data you were unaware of, such that hashing a Base pointer twice yields different results because a member of Derived was placed by the compiler in the tail padding of Base.

In other words, don't do this. C++17 introduced the has_unique_object_representations type trait, which tells whether it is safe to do this to a given type. It is pretty much always false.
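
A hedged sketch of gating a byte-wise checksum on that trait. `ChecksumObject`, `Packed`, and `Padded` are hypothetical names, the byte sum is a stand-in for a real hash, and the layout comments assume a mainstream ABI with 4-byte-aligned `std::uint32_t`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct Packed {  // two same-sized integers: no padding on mainstream ABIs
    std::uint32_t a;
    std::uint32_t b;
};

struct Padded {  // typically 3 padding bytes after tag, with unspecified contents
    std::uint8_t tag;
    std::uint32_t value;
};

// Byte-wise checksum that only compiles when equal values are guaranteed
// to have equal object representations.
template <typename T>
std::uint32_t ChecksumObject(const T& object) {
    static_assert(std::has_unique_object_representations_v<T>,
                  "padding or non-unique representations make a byte checksum meaningless");
    const auto* bytes = reinterpret_cast<const unsigned char*>(&object);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        sum += bytes[i];
    }
    return sum;
}

inline void ExampleChecksumUsage() {
    const Packed p{1, 2};
    assert(ChecksumObject(p) == 3u);  // byte-order independent: 1 + 2
    // ChecksumObject(Padded{}) is expected to fail the static_assert
    // wherever Padded actually contains padding.
}
```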

arka2147483647 today at 10:43 AM

The best part of void* is that it is very terse. Both in definitions, and in access.

All cpp alternatives are more wordy.

I wonder how this conversation would go if there was an equally terse, but also type-safe, cpp alternative.

pwdisswordfishq today at 12:09 PM

> “Hey, why do you use the unsafe old C-style void* pointer?

Exactly, one should avoid unnecessarily erasing pointer target types. Luckily, C++ gives much better tools for that than C. This should have been a tem—

> Use some safe explicit type like uint8_t, which clearly represents an 8-bit byte!”

Sigh. Out of the frying pan, into the fire.

drysine today at 2:28 PM

>a function that hashes some input data (using SHA-256, or whatever hash algorithm)

Along with padding bytes.

> Why should people complexify and uglify their C++ code with the uint8_t pointer (or std::byte), when void* works just fine??

That was the intention of reinterpret_cast - make ugly code look ugly.

singpolyma3 today at 1:48 PM

Next realize you can just use C instead of C++ at all!

themafia today at 10:31 AM

I'm not a fan of C++ precisely because of template noise, but what you gain with span, in that the pointer and the length are joined together, seems to outweigh the complaints about style.

Isn't there a way to make this an alias anyways?

squirrellous today at 10:54 AM

One could argue the reinterpret_cast makes the intent more explicit which is a good thing.

That said I don’t have much against the use of void* or even char* here. If it works in C, it works in C++ just fine. std::span is not the right tool for this.

api today at 11:53 AM

Real programmers use uintptr_t for pointers.

_the_inflator today at 11:06 AM

I think that the author is right in everything he says and yes, there is beauty in it.

However, the antithesis is also correct that there exist better solutions to solve the issues.

Both premises hold true.

I have an extensive assembler coding background on 6510, M68000, and i486. I had a very hard time accepting that something could be solved faster and more stable in a higher order language while the downside is more memory, more CPU etc.

More and more it turns out that programming languages are something accidentally read by machines and written by humans, even though this premise got destroyed lately by AI.

However, what I love about C++ is, that it has a basic canon of commands that can be used to build nearly everything while looking extremely ugly and hard to grasp if you don't read very slowly and accurately - so it is a very error prone and dangerous thing that rightfully got substituted by better constructs that allow for better distinctions as well as usage.

I could do everything in assembler (Hey Python users: you know that in the end everything ends up as machine code, don't you?) but it takes 100x times longer and is constantly reinventing the wheel.

Have you ever started to get into the intricacies of bit signs? No? Well, you should definitely, and to this day it gave me a lasting impression when I started wrapping my head around it, when I was 10 to 11 years old hacking my way into the world of assembler programming on C128.

You don't want to take every concept into consideration. You don't want to take interoperability into consideration. All the time!

You want to focus on the problem to solve, not the implications of the implementations all the time.

I am having such a blast very often using Python since it just works without much cognitive distraction about which language construct to use in order to get the machine doing what you want. It is so capable, enable it, to simply ensure within boundaries that the compiler uses the best decision given the context, which is up to analysis.

That's why I stopped using C++ or, more precisely, stopped any attempts at trying to be smart or fancy. I have to re-read and maintain the code months to years later, and history showed that I don't marvel at how magically the line works and how brutally smart I was at the time, but simply hate myself for obscuring something in one line that could have been well understood if I had used 10 lines, while the compiler doesn't give a damn anyway.

C++ is still necessary but every discussion to this day is about the point you made: every digit counts - and also which position, context etc. You got to be very prolific in order to put into a line what other put into 10.

Is it worth it? No.

In the early days it was the correct decision. Memory was scarce, CPUs were slow, and the language was small compared to today.

The last time I felt comfortable with an "assembler kind of feeling" was with JavaScript before ES6. Peak jQuery level, with the coolest concept only JavaScript has: Function.prototype.toString()

John Resig, who revealed this secret to me, will have his place in my Olympus of programming heroes; it opened my eyes to the beauty of higher-order languages.

I admire C++, but so do I Python.

But I hope I won't have to ever use C++ again.

adev_ today at 11:17 AM

This post is, honestly speaking, a bag of garbage and ill advice:

> Some good old habit from C can still be positively used in C++, like the void* pointer and the size parameters.

That's garbage.

There is a clear interest in passing both size AND pointer in a single parameter like `std::span<std::byte>`: it binds both values together and guarantees that you do not mess up the size of your buffer.

Pass "data" and "size" parameters through a chain of 5 function calls and there is a non-zero probability that you passed "other_size" instead of "size" somewhere. This pattern happens everywhere in old C codebases and has been the source of countless security vulnerabilities and random buffer overflows for decades.

All modern languages (including freaking minimalist Golang) have now a "slice/span" concept built in.

It is not just to annoy programmers (and allow them to complain about 'complexity' in blog posts) but because it is a major improvement in terms of memory safety and in terms of reducing user errors.

> It seems that some people are really losing the taste for good readable code.

If 'span<std::byte>' or 'span<char>' are unreadable for you, the problem is not span, the problem is you.

These are concepts that have existed for decades in almost all modern programming languages.

Even in conservative C++, it has existed since 2014 in the GSL, in Qt and in boost.

And the interface is no different from vector...no excuse here... It is itself the most basic data-structure in C++.

> Why should people complexify and uglify their C++ code with the uint8_t pointer (or std::byte), when void* works just fine??

Sure. Let's extend the logic: I do propose also to replace all typed arguments with a void* pointer.

Because after all: 'It will just work fine', right?

Type safety and clear interfaces are overrated; we could all use only bytes and remove interfaces altogether to get closer to the experience of Fortran 77.

/irony

> Or maybe something even more complicated, like this?
> template <typename T, std::size_t N> void DoSomething(std::span<T, N> data)

First, that is nonsense.

If you want to pass a mutable buffer of byte, the correct signature is:

``void DoSomething(std::span<std::byte> data)``

There is no need for template signature here. You are making things up.

Second, there is also no need for the N parameter

``span<Type,N>`` is only used when enforcing a buffer whose size is known at compile time is desirable. It can be for vectorization (e.g. the buffer is a multiple of the SIMD width) or to make it explicit in the interface (e.g. for a block cipher).

> states that the pointer points to input read-only memory (_In_reads_)

You do that by using `std::span<const std::byte>` in any C++ codebase.

The fact he brags about that as "an advantage" for separate parameter passing just shows how little is known here.

> My Pluralsight Courses

The kind of C++ code proposed in this blog post would be refused straight away in any PR in almost any serious organization with a proper review process.

So bragging about it on a blog while proposing some C++ teaching is audacious to say the least.

> To finish on that.

The sad thing is that there would be very valid criticism on `std::span<std::byte>`:

- Span does not do boundary check on access by default. Which is a bad design decision in 2026.

- It has an impact on compilation time due to the header inclusion

- std::byte is annoying to work with because it is a hack around an enum instead of a proper C++ builtin type.

But the blog post misses all these points entirely and sticks to complaining about 'Old C being better' the same way your family Grand-Uncle still brags about 'lead gasoline being better' for his 70s Pontiac.

bcjdjsndon today at 12:10 PM

Makes you wonder why OP is using cpp to begin with if they're suggesting void*