Hacker News

IPv6 zones in URLs are a mistake

86 points by xena yesterday at 9:42 PM | 70 comments

Comments

Sohcahtoa82 yesterday at 11:46 PM

It gets worse than that.

The Python `ipaddress` library has an `ip_address` function that returns either an IPv4Address or IPv6Address if the passed string is a valid IPv4 or IPv6 address, or raises a ValueError if the address is invalid.

I've seen code that uses that function to determine if a user-supplied string is a valid IP before passing it to a command line. At first glance, that seems fine, but some shell metacharacters are valid in the IPv6 zone ID.

`fe80::1%a;whoami>${PATH:0:1}tmp${PATH:0:1}pwned` is accepted as a valid IPv6 address, and if you did `ping fe80::1%a;whoami>${PATH:0:1}tmp${PATH:0:1}pwned`, you'd have the output of `whoami` written to /tmp/pwned.

Obviously, people shouldn't be writing code that puts user input into a shell call without the proper method of execution (i.e., shell=False when using subprocess.Popen), but people often think "I validated it, it's fine" and then get popped because their validation wasn't as good as they thought it was.

EDIT: In case it isn't clear, `${PATH:0:1}` is necessary in the attack payload because a `/` is invalid in a zone ID. `${PATH:0:1}` is a tricky way to get a `/` character by just grabbing the first character of your PATH environment variable.
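
A minimal sketch of the failure mode described above, assuming Python 3.9 or later (where `ipaddress` parses scope IDs) and a Linux-style `ping` that takes `-c`. `stdlib_accepts` and `ping_argv` are hypothetical helper names, not taken from any code the commenter saw.

```python
import ipaddress
from typing import List

PAYLOAD = "fe80::1%a;whoami>${PATH:0:1}tmp${PATH:0:1}pwned"


def stdlib_accepts(text: str) -> bool:
    """Return True if ipaddress.ip_address() parses the string."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


def ping_argv(text: str) -> List[str]:
    """Build an argument vector for ping (hypothetical helper).

    The address is validated and then kept as a single argv element,
    so no shell ever interprets the metacharacters inside it.
    """
    if not stdlib_accepts(text):
        raise ValueError(f"not an IP address: {text!r}")
    # Assumption: "-c 1" is the Linux flag for "send one packet".
    return ["ping", "-c", "1", text]
```

Passing the result to `subprocess.run(ping_argv(user_input))` keeps the default `shell=False`, so the payload reaches `ping` as one odd-looking argument instead of being split at the `;`.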

evgpbfhnr today at 12:58 AM

And it gets even more fun when browsers such as Firefox implemented this, then decided "no, we won't do it" and removed the feature -- now there's no way to access your router's web interface over a link-local address...

(rationale being that WHATWG said no: https://github.com/whatwg/url/issues/392 ; Firefox bug https://bugzilla.mozilla.org/show_bug.cgi?id=700999 )

The "solution" is to use a proxy such as https://github.com/twisteroidambassador/prettysocks/tree/ipv... which incidentally encodes the % as an `s` and handles special URLs like http://fe80--1ff-fe23-4567-890as3.ipv6-literal.net for you through the SOCKS DNS resolution feature... I've never found anything else that works recently -_-
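
For illustration, the `ipv6-literal.net` naming visible in that hostname (colons become dashes, `%` becomes `s`) can be sketched as below. The function names are hypothetical and this is not claimed to be how the linked proxy implements it.

```python
SUFFIX = ".ipv6-literal.net"


def to_literal_host(address: str) -> str:
    """Encode an IPv6 address (optionally with %zone) as an ipv6-literal.net name."""
    return address.replace(":", "-").replace("%", "s") + SUFFIX


def from_literal_host(host: str) -> str:
    """Decode an ipv6-literal.net name back to address[%zone] form."""
    if not host.lower().endswith(SUFFIX):
        raise ValueError(f"not an ipv6-literal.net name: {host!r}")
    label = host[: -len(SUFFIX)]
    # Hex digits and dashes never include "s", so the first "s" starts the zone.
    address, sep, zone = label.partition("s")
    return address.replace("-", ":") + ("%" + zone if sep else "")
```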

AshamedCaptain yesterday at 10:54 PM

You complain about URL encoding? Enter UNC encoding...

https://devblogs.microsoft.com/oldnewthing/20100915-00/?p=12...

> \\fe80--1ff-fe23-4567-890as3.ipv6-literal.net\share

sedatk yesterday at 11:39 PM

That's a bit of a stretch. First, IPv4 can't handle this scenario at all. It's an IPv6 feature. So, let's just be thankful that this exists. Amen.

Second, if you don't want to use interface IDs, you can just enable ULAs on your networks, and routing will take you to the correct interface.
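
As a rough sketch of the distinction being drawn here: link-local addresses live in `fe80::/10` and ULAs in `fc00::/7`, and only the former need a zone to pick an interface. `classify` is a hypothetical helper built on the standard `ipaddress` module.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("fe80::/10")
UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")


def classify(text: str) -> str:
    """Label an address as link-local, unique-local, other, or not IPv6."""
    addr = ipaddress.ip_address(text)
    if addr.version != 6:
        return "not-ipv6"
    if addr in LINK_LOCAL:
        return "link-local"
    if addr in UNIQUE_LOCAL:
        return "unique-local"
    return "other"
```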

Tharre yesterday at 10:53 PM

"IPv6 is weird. One of the more strange parts of the standard is that every interface's link local addresses are in `fe80::whatever`."

How is IPv6 weird here? It's the exact same thing in IPv4, no? If you have two different network interfaces, you have to identify which is which somehow, either by assigning a specific IP range to it or by adding some kind of identifier.

Making zones part of addresses in the first place was probably a mistake, I agree, but the problem of address conflicts when users can choose arbitrary addresses certainly isn't a design flaw of IPv6.

jchw yesterday at 11:05 PM

I ran into some of these issues when working on IPv6 validation in a library. I found that if you just call system functions like inet_pton, you would also get OS-dependent restrictions on what zone identifiers are valid! This isn't ideal so I wound up just making an IPv4/IPv6 parser with a very liberal zone ID production. Said library also supported URLs, and I did not implement it to parse the IPv6 literal as percent encoded in this edge case, but it winds up working both ways anyways. Is this good? Maybe not: maybe it would've been better to pick a strict subset instead. However, whether or not that would be better depends on specific use cases. Unfortunately, there is just no perfect answer sometimes.
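
To make the "strict subset" alternative concrete, here is one possible sketch. The zone alphabet and length limit in `ZONE_RE` are assumptions chosen for illustration, not something any standard or the commenter's library prescribes, and `parse_zoned` is a hypothetical name.

```python
import ipaddress
import re

# Assumption: a deliberately strict zone ID alphabet (letters, digits, ".", "_", "-").
ZONE_RE = re.compile(r"[A-Za-z0-9._-]{1,32}")


def parse_zoned(text):
    """Split 'addr%zone' and validate both halves separately.

    Returns (address object, zone string or None); raises ValueError otherwise.
    """
    addr_text, sep, zone = text.partition("%")
    addr = ipaddress.ip_address(addr_text)  # raises ValueError on bad input
    if not sep:
        return addr, None
    if addr.version != 6:
        raise ValueError("zone IDs only apply to IPv6 addresses")
    if not ZONE_RE.fullmatch(zone):
        raise ValueError(f"zone ID outside the allowed subset: {zone!r}")
    return addr, zone
```

The trade-off is the one described above: a strict allowlist rejects shell metacharacters outright, but it will also reject any interface name that falls outside the chosen alphabet.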

yjftsjthsd-h yesterday at 10:38 PM

> In order to disambiguate what's the host and what's the port, you typically format the IPv6 address in square brackets, so fe80::4 on port 80 would look like this:

> [fe80::4]:80

I really do wish they'd just stuck with dots. Or if we must upend things, commit to the bit and change the character to separate ports.
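
The bracket rule in the quoted passage amounts to something like the following sketch. `join_host_port` is a hypothetical helper that brackets any host containing a colon.

```python
def join_host_port(host: str, port: int) -> str:
    """Combine host and port, bracketing hosts that contain ':' (IPv6 literals)."""
    if ":" in host:
        return f"[{host}]:{port}"
    return f"{host}:{port}"
```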

epistasis yesterday at 10:41 PM

> And with the right scope it looks like this:

    [fe80::4%eth0]:80

> Now let's get URL encoding into the mix. ...

About here I felt my heart start to beat really fast and I started to hyperventilate.

I'll just accept that this is as much of a nightmare as it seems.

Dagger2 today at 12:22 AM

I've never really got why this is so complicated. My interpretation of [] syntax in URLs is "[ enters into a raw address mode", "] exits the raw address mode" and "the characters between the brackets are opaque address characters to be passed to getaddrinfo()".

(It basically has to be this way, or the URL syntax would need to be updated to support future address families with their own address formats. New address families can be loaded at runtime, including ones that didn't exist at the time your current software was compiled -- and this is handled properly by the BSD socket API -- so hardcoding possible address formats is incorrect.)

The _only_ character that needs special handling is ], and if you're willing to declare that you can't be bothered to support link-local addresses at all then declaring that you'll support anything except addresses containing a "]" should be far easier.
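
A sketch of that interpretation, under the assumption that the authority string has no userinfo part and starts directly with `[`. `bracketed_host` and `resolve_literal` are hypothetical names; `socket.getaddrinfo` and `AI_NUMERICHOST` are the standard library's.

```python
import socket


def bracketed_host(authority: str) -> str:
    """Return the raw text between '[' and the first ']' without interpreting it."""
    if not authority.startswith("["):
        raise ValueError(f"no bracketed host in {authority!r}")
    end = authority.find("]")
    if end == -1:
        raise ValueError(f"unterminated '[' in {authority!r}")
    return authority[1:end]


def resolve_literal(authority: str, port: int):
    """Hand the opaque bracket contents to getaddrinfo and let the OS judge them."""
    host = bracketed_host(authority)
    # AI_NUMERICHOST: treat host as a literal address and never do a DNS lookup.
    return socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM, flags=socket.AI_NUMERICHOST
    )
```

Whether a given zone resolves then depends on the interfaces present on the machine, which is the OS-dependent behavior mentioned elsewhere in the thread.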

gerdesj today at 12:08 AM

"so if you have a packet destined to fe80::4, how do you disambiguate it?"

Routing tables get you to the destination, but I think the question is about which source address to use, i.e., which network card/interface to use as source - after all, they are all in fe80::.

For a destination in fe80:: the OS will pick the one on the right interface (in effect the IPv6 version of ARP).

You never use fe80:: as a source for a network beyond fe80:: because it and they are link local addresses. You'll send to the default gateway/GoLR/etc unless you have more explicit routes and set your source address as your IPv6 "identity" which might be one of many.

Anyway, here's your problem:

"But if you try to parse this as a URL in Go, you get an error:"

Go needs fixing!

OptionOfT today at 12:27 AM

In Rust there is the same problem. The `url::Url` type (from the `url` crate) does not support `%<zone_id>`.

`http::Uri` does, and it accepts both `%` and `%25`.

https://play.rust-lang.org/?version=stable&mode=debug&editio...

lxgr yesterday at 10:40 PM

Are URLs of link local addresses a common thing with IPv6? I don’t think I’ve ever encountered one myself (but my home network supports ULAs and more importantly DNS).

neild yesterday at 10:46 PM

> In theory, there is guidance for how to properly handle IPv6 zones in user interfaces in RFC 9884, but there's no such guidance for URLs.

RFC 6874: Representing IPv6 Zone Identifiers in Address Literals and Uniform Resource Identifiers (https://www.rfc-editor.org/rfc/rfc6874.html)

Which says that, yes, you need to %-encode the %, so a URL containing a host of fe80::4%eth0 becomes http://[fe80::4%25eth0]/. Yes, that's ugly. Sorry.
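
As a small illustration of that rule, the sketch below builds and takes apart the `%25` form. The helper names are hypothetical; `quote` and `unquote` are from the standard `urllib.parse` module.

```python
from urllib.parse import quote, unquote


def zoned_http_url(address: str, zone: str, path: str = "/") -> str:
    """Build an http URL whose host is an IPv6 literal with a zone ID.

    The '%' delimiter is written as '%25'; the zone itself is percent-encoded.
    """
    return f"http://[{address}%25{quote(zone, safe='')}]{path}"


def split_zoned_host(bracket_content: str):
    """Inverse of the host part: 'fe80::4%25eth0' -> ('fe80::4', 'eth0')."""
    address, sep, zone = bracket_content.partition("%25")
    return address, (unquote(zone) if sep else None)
```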

> TL;DR: computers were a mistake.

I agree entirely.

(For what it's worth, I am a maintainer of Go's net/url package, and I believe net/url correctly handles zone ids in URLs. It's always possible there's something wrong I'm not aware of. Please let me know if there is!)

OptionX yesterday at 11:48 PM

I thought fe80::whatever was only for link local, and link local was only for 1-1 communication with router for SLAAC.

After you'd get a unique local address, then that would be used for normal routing needs.

Did I get that wrong?

ghhhibhc yesterday at 10:28 PM

Nothing is more idiomatic Go than ignoring inconvenient edge cases.

jasonjayr yesterday at 10:31 PM

Also, thank you Windows for not having consistent interface IDs after reboot. I had to rewrite a configuration file every startup with PowerShell in order to tackle this case.

manytimesaway today at 1:05 AM

Ads on a blog you selfpost on HN is a new low.

nickburns yesterday at 11:42 PM

More strange. Stranger. This is strange. Stranger? Who are you?

JackSlateur yesterday at 10:43 PM

TL;DR: computers were a mistake.

singpolyma3 yesterday at 11:01 PM

I don't even understand what's being complained about here. If you want a % in a URI you need to encode it. It's not rocket science.
