logoalt Hacker News

happytoexplainlast Sunday at 10:29 PM2 repliesview on HN

> poorly designed String API

Nope nope nope.

I have to agree strongly with my sibling commenter. Every other language gets it horribly wrong.

In app dev (Swift's primary use case), strings are most often semantically sequences of graphemes. And, if you at all care about computer science, array subscripting must be O(1).

Swift does the right thing for both requirements. Beautiful.

OK, yes, maybe they should add a native `nthCharacter(n:)`, but that's nitpicking. It's a one-liner to add yourself.


Replies

tialaramexlast Monday at 12:43 AM

I don't think Rust gets this horribly wrong. &str is some bytes which we've agreed are UTF-8 encoded text. So, it's not a sequence of graphemes, though it does promise that it could be interpreted that way, and it is a sequence of bytes but not just any bytes.

In Rust "AbcdeF"[1] isn't a thing, it won't compile, but "AbcdeF"[1..=1] says we want the UTF-8 substring starting from byte 1 through to byte 1 and that compiles, and it'll work because that string does have a valid UTF-8 substring there, it's "b" -- However it'll panic if we try to "€300"[1..=1] because that's no longer a valid UTF-8 substring, that's nonsense.

For app dev this is too low level, but it's nice to have a string abstraction that's at home on a small embedded device where it doesn't matter that I can interpret flags, or an emoji with appropriate skin tones, or whatever else as a distinct single grapheme in Unicode, but we would like to do a bit better than "Only ASCII works in this device" in 2025.

show 2 replies
ks2048last Monday at 1:08 AM

I think using "extended grapheme clusters" (EGC) (rather than code points or bytes) is a good idea. But, why not let you do "x[:2]" (or "x[0..<2]") for s String with the first two EGCs? (maybe better yet - make that return "String?")

show 2 replies