North American Network Operators Group
Re: MTU of the Internet?
Marc Slemko fills my screen with:

> > The reason for this change cited by many customers is that many ISPs have
> > 576 MTUs set "inside" their networks and packets get fragmented.
>
> I really don't buy that.  Many or most backbone links have MTU >1500, and
> MTUs <1500 outside of low-speed dialup connections aren't that common.
> They are there, yes.  But not that common.
>
> My understanding of why a lower MTU is demonstrably better under Win95
> is because the Win95 TCP stack is broken, and it is a good workaround.
> Most of the people raving about it are saying they are getting 2-4 times
> speed increases from changing their MTU from 1500 to 576.  Something is
> very wrong there.  I thought I had heard details about exactly what is
> broken in the Win95 TCP stack that causes this problem, but can't recall
> them at the moment.  It could have no basis in reality and just be a
> rumour.

I have my MTU set to 552 and it helps quite a bit.  It's not an issue
of the Win95 stack being broken.  I'm running Linux.  The reason I chose
to use a low MTU, and sometimes I knock it down even further, is to be
able to improve my telnet interactive response over a 33.6k link that I
also run as many as 4 concurrent downloads on.

Here's what I suspect is happening:

With web surfing, a page loads with many images, each of which is often
larger than a sliding window worth of packets.  The browser will nearly
concurrently connect and request every image.  Thus for N images you now
get N sliding windows worth of packets slammed at you.  This takes up a
_lot_ of buffer space in the dialup routers for all these concurrent TCP
connections all sending data at the same time over a high speed net to a
low speed final link.  With this happening, buffer space is exhausted and
packets are discarded.

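The arithmetic behind the two points above (interactive response on a
33.6k link, and N windows arriving at once) can be sketched roughly as
follows.  This is a back-of-the-envelope illustration, not a measurement:
the function names and the 8 KiB window are assumptions chosen for the
example, and modem compression and framing overhead are ignored.

```python
# Back-of-the-envelope sketch.  The window size is an illustrative
# assumption, not a measured value; framing and compression are ignored.

LINK_BPS = 33_600  # the 33.6k dialup link mentioned above


def serialization_delay(packet_bytes: int, link_bps: int = LINK_BPS) -> float:
    """Seconds the link is busy sending one packet of the given size."""
    return packet_bytes * 8 / link_bps


def burst_bytes(connections: int, window_bytes: int) -> int:
    """Bytes headed for the dialup router if every connection
    sends a full window at the same time."""
    return connections * window_bytes


if __name__ == "__main__":
    for mtu in (1500, 552):
        print(f"MTU {mtu}: {serialization_delay(mtu):.3f} s per packet")

    # Hypothetical case: 4 concurrent connections, 8 KiB window each.
    total = burst_bytes(4, 8192)
    print(f"burst: {total} bytes, {serialization_delay(total):.1f} s to drain")
```

Under these assumptions a 1500-byte packet occupies the link for about
0.36 s and a 552-byte packet for about 0.13 s, which is roughly how long
a telnet keystroke may have to wait behind one download packet.
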
If you set the MTU smaller, then the size of all those packets is smaller
and the chance of being discarded due to memory exhaustion is reduced,
even if you're the only one on that server with small packets.

> There are all sorts of people spouting all sorts of lies around Windows
> newsgroups about why small MTUs are good; I think novice users are simply
> getting drawn in by supposed experts.

Such as?

I assert that small MTUs are good because in real life practice it
actually does improve the effective responsiveness and reduces the
lag-causing loss of packets.

Of course if something is broken, it needs fixing.  Like maybe a lack of
buffer space in the terminal server?

> I guess systems receiving data from servers with broken retransmission
> timers (eg. how Solaris used to be) could be helped by lowering the MTU
> which would result in faster ACKs so bogus retransmissions won't happen
> all the time, but the fix for this really isn't to lower the MTU.

Perhaps an initial transmission pace on the first few packets sent over
a connection would help, too.  There is virtually no value in sending
packets 1-7 immediately behind packet 0.  The problem is there isn't any
definition of what the pace should initially be.  If I were to redesign
TCP I would probably make the sliding window small initially and
gradually widen it as the turnaround time becomes obvious, then pace the
packets to match what the turnaround time looks like.

> You also get the obvious improvements in interactive performance, and you
> start getting data more quickly.
>
> I would suggest that you would be well advised to find a handy user or
> four where this effect is easily observable, and find out what is really
> going on.

I suspect the buffer overflow situation.  At least it should be looked at.

I recall a few months ago a provider I was connected to was having some
very horrible performance.  My modem connection wasn't blocking at all
so it was letting everything through.
I started up a ping from a machine out on the net back to me, and
noticed I was getting most packets through OK, yet my TCP connections
were horrible.  Well, ping defaults to 64 bytes, so I increased it to
1472 and virtually none of them ever got here.  I tried a number of
different sizes and found erratic performance, but on average the loss
was proportional to the packet size up to around 1k, where there was
near total loss.

I dropped my MTU down to 64!! and reconnected the telnet session.  Now I
was getting through.  I was getting a very obvious "small-chunk" kind of
response performance, but it did let me through.

As far as I could tell, that terminal server was out of buffers, perhaps
being ping-flooded.  Assuming a ping-flooder was intending to hurt that
ISP, they could concurrently flood all addresses in the pool to fill as
many buffers as possible.  Since I was on a static IP outside of the
pool I wouldn't have seen "my share" of the flood leaking through.
Still, my 64 byte packets apparently could grab the little bits of
buffer space that opened up from time to time a lot more easily than the
big 1500 byte "monsters".

After that event I tested out what telnet felt like under a number of
different MTU values and settled on 552.  I have been using it since and
have been quite happy with it.

Some ISPs may in fact be setting the MTUs on their routers lower, to 576
or some other number, just to force MTU discovery to kick in and chop
the size of the packets.  While I might consider doing this myself, I
hesitate because MTU discovery is often broken for a number of reasons.
Some sites are (stupidly) blocking all ICMP and if they also have DF
set, the connection is hosed.
In other cases, like the early version of the Cisco Local Director, the
ICMP did not get back to the machine that held the connection (since
ICMP network-unreachable doesn't include port numbers and LD's tables
surely were port indexed) so again MTU discovery was broken and if DF
was set, the connection was hosed.  I saw this because many web servers
apparently push the HTTP header part of the response ahead, and it was
smaller than my link MTU at the time so it came through OK, but the data
that followed filled the packets and never made it.  I saw the DF on the
packet with the HTTP part and that's how I figured out what was going on.

So turning down the MTU in the router at the ISP can be a problem and
should not be done, but turning down the MTU on the end machine will
work, whether it is ultimately the correct solution or not.

Users also perceive a better response if all the images are loading in
parallel as opposed to loading one at a time, even if the MTU setting
that accomplishes this smoothly has the net effect of a longer time for
a complete load of all images.

Maybe what we need is a multiplex-HTTP with a server smart enough to
send the images over separate subchannels without the browser needing
to request them.

--
Phil Howard | phil at milepost dot com