Wouldn't branchless UTF-8 encoding always write 3 bytes to RAM for every character (possibly to the same address)?
You could do two passes over the string: first compute the total length in bytes, then fill in the output code point by code point.
You could also pessimistically over-allocate assuming four bytes per character and then resize afterwards.
With the API in the linked blog post it's up to the user to decide how they want to use the output [u8;4] array.
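
For what it's worth, here is a rough sketch of both approaches in Rust. The `encode` function is a hypothetical stand-in for the blog post's encoder: I'm assuming it hands back the four-byte array plus the number of bytes actually used, and I've implemented it with the standard library's `char::encode_utf8`, so it is not branchless itself. The names `two_pass` and `over_allocate` are made up for illustration.

```rust
// Hypothetical stand-in for the blog post's encoder (assumed shape):
// returns the encoded bytes padded to four, plus the count actually used.
// Implemented via the standard library, so it is NOT branchless itself.
fn encode(c: char) -> ([u8; 4], usize) {
    let mut buf = [0u8; 4];
    let len = c.encode_utf8(&mut buf).len();
    (buf, len)
}

// Strategy 1: two passes. First sum the exact byte length, then fill.
fn two_pass(chars: &[char]) -> Vec<u8> {
    let total: usize = chars.iter().map(|c| c.len_utf8()).sum();
    let mut out = vec![0u8; total];
    let mut pos = 0;
    for &c in chars {
        let (buf, len) = encode(c);
        // Copy only the bytes that belong to this code point.
        out[pos..pos + len].copy_from_slice(&buf[..len]);
        pos += len;
    }
    out
}

// Strategy 2: over-allocate four bytes per character, always store all
// four bytes, advance by the real length, then shrink at the end.
fn over_allocate(chars: &[char]) -> Vec<u8> {
    let mut out = vec![0u8; chars.len() * 4];
    let mut pos = 0;
    for &c in chars {
        let (buf, len) = encode(c);
        // Unconditional four-byte store; the padding bytes are
        // overwritten by the next character or cut off by truncate.
        out[pos..pos + 4].copy_from_slice(&buf);
        pos += len;
    }
    out.truncate(pos);
    out
}
```

The second version is the one that matches the question: every character does a fixed four-byte store regardless of its encoded length, and only the write cursor advances by a variable amount. The first version trades that for an exact allocation at the cost of walking the input twice.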