Hacker News

Dwedit 01/17/2025 (2 replies)

Wouldn't branchless UTF-8 encoding always write 3 bytes to RAM for every character (possibly to the same address)?


Replies

ack_complete 01/18/2025

CPUs are surprisingly good at dealing with this in their store queues. I see this write-all-and-increment-some technique used a lot in optimized code, like branchless left-pack routines or overcopying in the copy handler of an LZ/Deflate decompressor.
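A minimal sketch of the write-all-and-increment-some pattern, using left-pack as the example. The function name `left_pack` and its signature are made up for illustration, and the compiler is free to emit different machine code than the source suggests (the slice bounds check is still a branch, although it is never taken here).

```rust
/// Copies the elements of `src` that satisfy `keep` to the front of `dst`
/// and returns how many were kept. Hypothetical helper, not from any library.
/// `dst` must be at least as long as `src`.
fn left_pack(src: &[u32], dst: &mut [u32], keep: impl Fn(u32) -> bool) -> usize {
    assert!(dst.len() >= src.len());
    let mut out = 0;
    for &x in src {
        // Store every element unconditionally, possibly overwriting the
        // same slot several times in a row.
        dst[out] = x;
        // Advance the write index only when the element is kept, so a
        // rejected element is overwritten by the next store.
        out += keep(x) as usize;
    }
    out
}
```

Slots at or past the returned count hold leftovers from overwritten stores and should be ignored by the caller.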

ngoldbaum 01/17/2025

You could do two passes over the string: first compute the total length in bytes, then allocate the buffer and fill it in codepoint by codepoint.
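A rough sketch of the two-pass idea, assuming the input is a slice of `char`. It uses the standard library's `char::len_utf8` and `char::encode_utf8` as stand-ins for whatever encoder you actually have; `encode_two_pass` is a made-up name.

```rust
/// Two-pass encoding: measure first, then fill an exactly sized buffer.
/// Hypothetical helper; std's encoder stands in for a branchless one.
fn encode_two_pass(codepoints: &[char]) -> Vec<u8> {
    // Pass 1: sum the encoded length of every codepoint.
    let total: usize = codepoints.iter().map(|c| c.len_utf8()).sum();

    // Pass 2: allocate once and write each codepoint at its offset.
    let mut out = vec![0u8; total];
    let mut pos = 0;
    for &c in codepoints {
        // encode_utf8 writes into the remaining space and returns the
        // encoded &mut str, whose length is the number of bytes written.
        pos += c.encode_utf8(&mut out[pos..]).len();
    }
    out
}
```

The buffer is never larger than needed, at the cost of walking the input twice.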

You could also pessimistically over-allocate assuming four bytes per character and then resize afterwards.
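A sketch of the over-allocate-then-shrink approach, assuming an encoder that returns a `[u8; 4]` plus a length. `encode4` below is a hypothetical stand-in built on std's `char::encode_utf8`, not the blog post's implementation, and `encode_overallocate` is likewise a made-up name.

```rust
/// Hypothetical encoder with a fixed-size output: four bytes plus the
/// number of them that are meaningful. Built on std for illustration.
fn encode4(c: char) -> ([u8; 4], usize) {
    let mut buf = [0u8; 4];
    let len = c.encode_utf8(&mut buf).len();
    (buf, len)
}

/// Reserves four bytes per codepoint, always stores four bytes, advances
/// by the real length, then trims the buffer to the bytes actually used.
fn encode_overallocate(codepoints: &[char]) -> Vec<u8> {
    let mut out = vec![0u8; codepoints.len() * 4];
    let mut pos = 0;
    for &c in codepoints {
        let (bytes, len) = encode4(c);
        // Fixed-size store: any padding bytes are overwritten by the next
        // codepoint or cut off by the truncate below.
        out[pos..pos + 4].copy_from_slice(&bytes);
        pos += len;
    }
    out.truncate(pos);
    out.shrink_to_fit();
    out
}
```

Because `pos` grows by at most four per codepoint, the four-byte store always stays inside the pessimistic allocation.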

With the API in the linked blog post it's up to the user to decide how they want to use the output [u8;4] array.