So is this potentially performance improving?.
Last time I tested branchless UTF-8 algorithms, I came to the conclusion that they only perform [slightly] better for text consisting of foreign multibyte characters. Unless you expect lots of such inputs on the hot path, just go with traditional algorithms instead. Even in the worst case the difference isn't that big.
Sometimes people fail to appreciate how insanely fast a predictable branch really is.
Usually people are interested in branchless implementations for cryptography applications, to avoid timing side channels (though you then have to make sure that the generated instructions don't have different timing for different input values), and will pay some time penalty if they have to.