North American Network Operators Group
Re: Deploying IPv6 in a datacenter (Was: Awful quiet?)
On Dec 21, 2005, at 2:09 AM, Jim Popovitch wrote:
We've just gone through a pretty decent sized attempt to convert our infrastructure and applications to IPv4/IPv6 dual stack, and I was asked by someone else to write up the successes and problems I had doing this. I'm nowhere near done writing everything up, or even finished with the migration attempt, but here are some bits of my notes describing our real world experience to throw in. I'll add that I'm small potatoes to you guys out there; I'm sure those with much larger networks will face some additional challenges.
Our biggest client had a need for IPv6 for a couple of reasons.
The first was an application (which surprisingly was IPv6 ready) that required a unique IP address "per instance". They needed to be able to handle tens of thousands of "instances" to work around some brokenness in the application, and RFC1918 wouldn't cut it. Using made up IPv6 addresses with no IPv6 connectivity worked, but we wanted to do it "right".
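As a rough illustration of the "one address per instance" idea, here is a minimal sketch that numbers instances inside a single /64. The prefix is the 2001:db8::/32 documentation range standing in for a real allocation, and `instance_address` is a made-up helper name, not part of the application described above.

```python
import ipaddress

# Documentation prefix (RFC 3849), used here as a placeholder allocation.
PREFIX = ipaddress.IPv6Network("2001:db8:0:1::/64")


def instance_address(instance_id: int) -> ipaddress.IPv6Address:
    """Return the address assigned to a given instance number within PREFIX."""
    if not 0 < instance_id < PREFIX.num_addresses:
        raise ValueError("instance id does not fit in the prefix")
    # Indexing a network yields the address at that offset from its base.
    return PREFIX[instance_id]


if __name__ == "__main__":
    for n in (1, 2, 50000):
        print(n, instance_address(n))
```

A single /64 leaves room for far more than tens of thousands of instances, which is the point of the exercise.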
The second was because of IRC and some Japanese students. This client has a pretty thriving chat community going, based around IRC. One niche of customers and users for this site that suddenly exploded in popularity was some East Asian (mostly Japanese) students, using these services from their dorm rooms or computer labs. The workstations themselves run IPv4, but the university's backbone was IPv6 only. The side effect of this was that all non-HTTP IPv4 connections were going through some kind of proxy server that had a time limit on how long you could keep a session open. They were getting booted off constantly, and the IT staff at the university was asking them to find an IPv6 IRC server they could use instead. (It's possible I'm misrepresenting the situation here. I didn't have direct contact with the people involved, so the technical details here might be wrong.)
The third was for VPN. A new product is going to use potentially hundreds of VPNs. Again, RFC1918 addresses won't work for this application, and we're trying to be a good neighbor and not blow several /24s worth of space on something that doesn't really need it. GRE tunnels worked fine for this, and allowed us to do IPv6 inside the tunnel over IPv4 transit and preserve our tiny IPv4 allocations.
But yeah, those are three really weak needs for requiring IPv6, but I'm guessing it's going to be niches like this that start the need for IPv6 at all. So, we decided we'd try to make our network IPv6 enabled, and see how hard it would be to run everything dual stack that would support it.
The first wakeup call that this wasn't going to be easy was that only one of our transit providers supported IPv6 in any way at all. After we got IPv6 turned up, the problems we discovered, roughly in order:
1) IPv6 on the internet overall seems a bit unreliable at the moment. Entire /32's disappear and reappear, gone for days at a time. The most common path over IPv6 from the US to Europe is US->JP->US->EU. I realize this may be specific to our connection itself, but browsing looking glasses seems to back up that it's not just us.
2) Lots of providers who are running IPv6 aren't taking it as seriously as IPv4. Manual prefix filters, NOC staff that doesn't even know they're running IPv6, etc.
3) Some key pieces of internet infrastructure are IPv6 oblivious. ARIN's route registry doesn't support the "route6" objects, for example.
4) Even though we went through our applications and software before starting this to check for IPv6 readiness, there's a huge difference between "Supports IPv6" and "actually works with IPv6".
5) Our DNS software (djbdns) supports IPv6, kind of. With patches you can enter AAAA records, but only by entering 32 digit hexadecimal numbers with no colons or abbreviations. We were never able to get it to respond to queries over IPv6, so all of our DNS is still IPv4.
6) Some software supports IPv6 and IPv4 at the same time by using IPv4 addresses on IPv6 sockets (i.e. they bind to tcp6 on port 80, and an incoming IPv4 connection appears as ::ffff:192.168.0.1). Other applications want to bind to two different sockets. Others want you to run two copies of themselves, one for IPv6 and one for IPv4. However, on a BSD system, the setting net.inet6.ip6.v6only is a system-wide configuration option. If you turn it off (allow IPv4 connections to come in on IPv6 sockets) for one application running on the server that requires it off, and you're running a different service on the same server that wants to run IPv4 and IPv6 separately, you have to make sure the IPv6 daemon doesn't start before the IPv4 one, or it will bind to both protocols and the IPv4 daemon won't be able to bind to the port.
7) We found that even if applications support IPv6, they default to disabling it during compiling 95% of the time. We had to go back and recompile our HTTP servers, PHP, all sorts of libraries, etc. Since IPv6 defaults to off on a lot of packages, we had issues just getting the software to BUILD with ipv6 turned on unless we were using the exact same libraries as were current when it was last released.
8) Once we got everything on the network and server side ready for and usable on IPv6 we discovered that a lot of our client's applications just had no idea what to do with IPv6 connections. Many PHP applications broke because they expected $_SERVER['REMOTE_ADDR'] to fit within 15 characters at most. Databases had to have their columns widened (if they were storing the address as text), or functionality had to be rewritten if they were storing IPs as 32 bit integers. Web server log analyzers claimed that the log was "corrupted" if it had an IPv6 address in it. Lots and lots of application logic just wasn't IPv6 aware at all, and either had serious cosmetic problems with displaying IPv6 addresses, or simply didn't work when an IPv6 address was encountered.
9) Once we started publishing AAAA records for a few sites, we started getting complaints from some users that they couldn't reach the sites. Some investigating showed that they had inadvertently enabled IPv6 on their desktop without having any IPv6 connectivity. In some cases it was traced to users reading about 6to4 and wanting to play with it but not correctly installing it, then not UNinstalling it. Others had turned on IPv6 on OSes that make it easy to do so (OS X for example) without realizing what it was. Some browsers/OSes seem better than others at figuring out that they don't have IPv6 working and falling back to IPv4.
10) Smaller than normal MTUs seem much more common on IPv6, and it is exposing PMTUD breakage on a lot of people's networks.
11) Almost without fail, the path an IPv6 user takes to reach us (and vice-versa) is less optimal than the IPv4 route. Users are being penalized for turning on IPv6, since they have no way to fall back to IPv4 on a site-by-site basis when using a web browser.
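For item 5, the flat form that the patched DNS data wanted can be produced mechanically instead of by hand. This is only a sketch: `to_flat_hex` is a hypothetical helper name, the address is from the documentation range, and the surrounding record syntax depends on whichever patch is in use, so it is not shown.

```python
import ipaddress


def to_flat_hex(address: str) -> str:
    """Expand an IPv6 address to 32 hex digits with no colons or abbreviations."""
    # .exploded restores every leading zero and the "::" run of zero groups.
    return ipaddress.IPv6Address(address).exploded.replace(":", "")


if __name__ == "__main__":
    print(to_flat_hex("2001:db8::1"))
```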
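For item 6, two things may help, sketched here under assumptions rather than as what we actually deployed. First, the standard sockets API (RFC 3493) defines a per-socket IPV6_V6ONLY option, so a daemon that sets it explicitly does not depend on the system-wide default; whether a given daemon does so is up to that daemon. Second, an application that receives IPv4 clients on an IPv6 socket can turn ::ffff:a.b.c.d back into plain dotted form before logging or comparing it. The function names below are invented for the example.

```python
import ipaddress
import socket


def open_v6only_listener(port: int) -> socket.socket:
    """Listen on IPv6 only, regardless of the system-wide v6only default."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    # Per-socket override: refuse IPv4-mapped connections on this socket.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
    sock.bind(("::", port))
    sock.listen()
    return sock


def normalize_peer(address: str) -> str:
    """Return dotted IPv4 for an IPv4-mapped address, otherwise the address itself."""
    ip = ipaddress.ip_address(address)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)


if __name__ == "__main__":
    print(normalize_peer("::ffff:192.168.0.1"))
    print(normalize_peer("2001:db8::1"))
```

With the option set on the IPv6 listener, a separate IPv4 daemon can bind the same port in either start order, at least on systems that honor the option per socket.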
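For item 8, the two storage fixes can be sketched as follows. A text column needs room for 45 characters, since the longest textual form of an IPv6 address is one with an embedded dotted IPv4 tail. Code that stored addresses as 32 bit integers can instead store a fixed 16 byte value, with IPv4 addresses kept in IPv4-mapped form. The helper names are hypothetical, and the example addresses come from the documentation ranges.

```python
import ipaddress

# Longest textual IPv6 form: six hex groups plus a dotted IPv4 tail.
MAX_TEXT_LEN = len("ffff:ffff:ffff:ffff:ffff:ffff:255.255.255.255")  # 45


def to_storage(address: str) -> bytes:
    """Pack an IPv4 or IPv6 address into a fixed 16-byte value."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # Store IPv4 as the IPv4-mapped IPv6 address ::ffff:a.b.c.d.
        return b"\x00" * 10 + b"\xff\xff" + ip.packed
    return ip.packed


def from_storage(raw: bytes) -> str:
    """Unpack a 16-byte value, showing IPv4-mapped entries as dotted IPv4."""
    ip = ipaddress.IPv6Address(raw)
    mapped = ip.ipv4_mapped
    # Note: a genuine ::ffff:a.b.c.d input also comes back as dotted IPv4.
    return str(mapped) if mapped is not None else str(ip)


if __name__ == "__main__":
    for addr in ("192.0.2.1", "2001:db8::1"):
        raw = to_storage(addr)
        print(addr, raw.hex(), from_storage(raw))
```

Keeping one fixed-width representation means existing comparison and range logic only has to be rewritten once, rather than carrying separate IPv4 and IPv6 code paths.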
In the end, we've backed out almost all of our changes to make sites IPv6 visible for now. It broke things for far more IPv4 users than it helped IPv6 users. We've left the IRC services running on a dual stack system, since we were able to partition IPv6 off well enough not to hurt things. Everything else is back to IPv4 only.
I'm also personally a bit concerned with how IPv6 allocation and routing works with respect to small to medium sized networks like ours. I know this is still a hot topic and several proposals are being passed around to resolve some of these issues, but it seems like I *lose* functionality with IPv6 that I have with IPv4, mostly due to the "don't deaggregate your allocation" mantra, and how far the bar was raised to get PI space. We do a lot of things in IPv4 land with regard to routing and addressing that I don't believe we can do in IPv6, which worries me more. Shim6 and other proposals are creative, but don't replace a lot of the functionality I'd be losing. This is another story though, and it's getting really off topic.
Getting your network running IPv6 doesn't seem to be the challenge anymore. None of our L2 devices cared at all. Our L3 devices took some configuration, but moved pretty easily. It's the server and application software that needs a lot more work. I don't think we're even close to the point where an end-user can go to their provider and say "IPv6 me!" and get it working without more hassle than it's worth to them.