Hacker News

jcgl today at 7:49 AM

> * Hosts react to the change by reconfiguring via SLAAC and/or DHCPv6, depending on the settings in the RA

This is the linchpin of the workflow you've outlined. Anecdotal experience in this area suggests it's not broadly effective enough in practice, not least because of this:

> * Existing client connections are still dead, [1] but the host gets to know that their global IP address has changed and has a chance to take action, rather than being entirely unaware

The old IP addresses (afaiu/ime) will not be removed before any dependent connections are removed. In other words, the application (not the host/OS) is driving just as much as the OS is. Imo, this is one of the core problems with the scenario, that the OS APIs for this stuff just aren't descriptive enough to describe the network reconfiguration event. Because of that, things will ~always be leaky.

> [1] I think whether they die slow or fast depends on how the router is configured

Yeah, and that configuration will presumably be sensitive to what caused the failover. This could manifest differently based on whether upstream A simply has some bad packet loss or whether it went down altogether (e.g. a physical fault).

In any case, this vision of the world misses on at least two things, in my view:

1. Administrative load balancing (e.g. lightly utilizing upstream B even when upstream A is still up)

2. The long tail of devices that don't respond well to the flow you outlined. It's not enough to think of well-behaved servers that one has total control over; one also needs to think of random devices with network stacks of... varying quality (e.g. IoT devices)


Replies

simoncion today at 9:12 AM

> The old IP addresses (afaiu/ime) will not be removed before any dependent connections are removed.

I have two reactions to this.

1) Duh? I'm discussing a failover situation where your router has unexpectedly lost its connection to the outside world. You'd hope that your existing connections would fail quickly. The existence of the deprecated IP shouldn't be relevant because the OS isn't supposed to use it for any new connections.

2) If you're suggesting that network-management infrastructure running on the host will be unable to delete a deprecated address from an interface because existing connections haven't closed, that doesn't match my experience at all. I don't think you're suggesting this, but I'm bringing it up to be thorough.

> ...the OS APIs for this stuff just aren't descriptive enough to describe the network reconfiguration event.

I know that Linux has a system (netlink?) that's descriptive enough for daemons [0] to start and stop listening on newly added/removed addresses nearly instantaneously. I'd be a little surprised if you couldn't use that mechanism to subscribe to "an address has become deprecated" events. I'd also be somewhat surprised if no one had built a nice little library on top of whatever mechanism that is. IDK about other OSes, but I'd be surprised if there weren't equivalents in the BSDs, macOS, and Windows.
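
For illustration only, a minimal sketch of that kind of subscription (Linux-specific, and not a claim about how BIND or NTPd actually do it): a raw `NETLINK_ROUTE` socket bound to the IPv6 address multicast group receives `RTM_NEWADDR`/`RTM_DELADDR` messages, and the `IFA_F_DEPRECATED` bit in the flags marks a deprecated address. The numeric constants are copied from the Linux UAPI headers; the helper names `parse_addr_events` and `watch` are made up for this example, and whether a message arrives at the exact moment an address becomes deprecated is an assumption to verify on your kernel.

```python
import ipaddress
import socket
import struct

# Values from <linux/rtnetlink.h> and <linux/if_addr.h>.
RTMGRP_IPV6_IFADDR = 0x100
RTM_NEWADDR, RTM_DELADDR = 20, 21
IFA_ADDRESS, IFA_FLAGS = 1, 8
IFA_F_DEPRECATED = 0x20

NLMSGHDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
IFADDRMSG = struct.Struct("=BBBBI")  # family, prefixlen, flags, scope, index
RTATTR = struct.Struct("=HH")        # len (header included), type


def parse_addr_events(data):
    """Yield (event, address, deprecated) for each IPv6 address message."""
    offset = 0
    while offset + NLMSGHDR.size <= len(data):
        msg_len, msg_type, _, _, _ = NLMSGHDR.unpack_from(data, offset)
        end = offset + msg_len
        if msg_len < NLMSGHDR.size or end > len(data):
            break  # malformed or truncated buffer
        body = offset + NLMSGHDR.size
        if msg_type in (RTM_NEWADDR, RTM_DELADDR) and end - body >= IFADDRMSG.size:
            _, _, flags, _, _ = IFADDRMSG.unpack_from(data, body)
            address = None
            attr = body + IFADDRMSG.size
            while attr + RTATTR.size <= end:
                attr_len, attr_type = RTATTR.unpack_from(data, attr)
                if attr_len < RTATTR.size:
                    break
                payload = data[attr + RTATTR.size:attr + attr_len]
                if attr_type == IFA_ADDRESS and len(payload) == 16:
                    address = ipaddress.IPv6Address(payload)
                elif attr_type == IFA_FLAGS and len(payload) == 4:
                    # 32-bit flags attribute supersedes the 8-bit header field.
                    flags = struct.unpack("=I", payload)[0]
                attr += (attr_len + 3) & ~3  # attributes are 4-byte aligned
            if address is not None:
                event = "new" if msg_type == RTM_NEWADDR else "del"
                yield event, address, bool(flags & IFA_F_DEPRECATED)
        offset += (msg_len + 3) & ~3  # messages are 4-byte aligned


def watch():
    """Linux only: print IPv6 address events until interrupted."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as sock:
        sock.bind((0, RTMGRP_IPV6_IFADDR))  # pid 0: let the kernel pick
        while True:
            for event, address, deprecated in parse_addr_events(sock.recv(65535)):
                print(event, address, f"deprecated={deprecated}")
```

Calling `watch()` on a Linux host should print one line per address event; the parser is kept separate from the socket so it can be exercised on canned bytes without netlink access.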

> In any case, this vision of the world misses on at least two things, in my view:

> 1. Administrative load balancing...

I deliberately didn't talk about load balancing. I expect that if you don't do that at a layer below IP, then you're either stuck with something obscenely complicated or you're doing something like using special IP stacks on both ends... regardless of what version of IP your clients are using.

> 2. The long tail of devices that don't respond well to the flow you outlined.

Do they respond worse than in the IPv4 NAT world? This and other commentary throughout indicates that you missed the point I was making. That point was that, unlike in the NATted world, the OS and the applications running in it have a way to plausibly be informed of the network addressing change. In the NAT case, they can only infer that shit went bad.

[0] ...like BIND and NTPd...