North American Network Operators Group
Re: links on the blink (fwd)
>I bet myself you would respond the way you did before I finished processing
>my mail bag. I won. Sure, 100% packet loss is eminently acceptable if that
>loss rate occurs not more than 1% of the time. Maybe 10% packet loss

I suppose that is correct. 100% packet loss for 1% of my *connections*, I can live with that. We also lived with high delays (like 0.5 sec satellite round trip on USAN, I don't mean those occasional 75 seconds in some, ahem, other environments at that time), as long as it was predictable. What drives me nuts is if services are unpredictable, like immediate packet service for 40% of the packets, a few seconds and occasional losses for another 40% of the packets, and the opportunity to keep up with Dave's technical field between keystrokes for the remaining 20%.

What does a consistent 10% packet loss mean? I think it has little to do with telco line quality these days, more with resource contention. What is contended? A T1 line (or whatever) is never, ever congested, as it is just a simple bit pipe. The contention is at the equipment, called routers, that aggregate more traffic on the inbound interfaces than they can dump onto the outbound interfaces (e.g., for outbound line capacity reasons, with buffers then filling up in the router). Historically that was often a router problem, as routers were too slow to deal with the onslaught of packets at a plain packets-per-second rate (remember, in 1987 the NSF solicitation asked for a then whopping 1000 packets per second per router, which was just barely achievable then). Today you can buy technology off the shelf that does not have a pps problem for typical situations.

So what is the problem, if it is not the router interconnection or the router technology? The answer is bad network engineering, little consideration for architectural requirements, and lack of understanding of the Internet workload profile. That applies intra-NSP, and perhaps even more among NSPs.
Or, in other words, it is people that kill the network, not the routers or phone lines, particularly people who are trying to make money off it, probably using their unique optimization function focused on profit and limiting expenses as much as they can, not understanding the fate sharing yet.

A constant 10% packet loss (or any constant packet loss) means that the network is underprovisioned. The *deployed* Internet depends heavily on massive aggregation of microscopic transactions (e.g., 10 packets or so at the 50th percentile of transactions, and tens of thousands of them perhaps in parallel). These aggregations result in some degree of steady state, but also burst behavior, which even in a well designed network can result in occasional losses due to overutilization of resources. But it should not happen consistently, if someone were to claim it to be a well designed and implemented network.

The conventional wisdom says to upgrade the capacity (e.g., more bandwidth to improve the outflow from the routers) to handle the additional load in cases of resource contention. That can be an expensive undertaking, and an administrative nightmare, especially in the international realm.

Maybe a band-aid could be a prioritization of traffic, so that more deserving traffic gets better service. For example, for my email I would typically not even know about 10% packet losses, but most interactive connections (call it laziness, stupidity, whatever) create several packets per keystroke, with their demands for end-to-end echoes (hey, Dave, you did a bad job of technology transfer out of Fuzzballs, as you got it right, line mode by default, going into character mode only if really necessary; i.e., proof that it is possible to do it right was available 10 years ago). Prioritization can be left either to the service provider (who may have to hide it; and it is also very hard to serve everyone right that way), or to the end-user.
If the end-user specifies a service profile (be it IP Precedence or whatever), it will only work if there is a penalty for higher service qualities (e.g., quotas, or precedence-0 is free and higher ones cost the end-user (or someone else in the path whose pain would be desirable here) something).

You would still need to understand the workload profile eventually; simple utilization graphs won't suffice if you compare the common microscopic transactions with those exhibiting a high bandwidth * duration product (e.g., them new multimedia thingies).

Anyway, no magic here either. These issues have been on the table for many years already, nothing new, though if the Internet is to survive, some service providers probably need to adjust their optimization models.
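
As a rough illustration of the router contention argument made earlier in this message (inbound interfaces aggregating more traffic than the outbound line can drain, with buffers filling up), here is a minimal sketch of a single outbound interface with a finite buffer. It is a toy model, not a description of any real router: the function names, the tick-based timing, and all the numbers (11 packets per tick offered, 10 per tick drained, a 50-packet buffer) are assumptions made up for the example.

```python
def simulate_output_queue(arrivals, service_per_tick, buffer_size):
    """Toy model of one outbound interface (all parameters hypothetical).

    arrivals         -- packets offered per tick, summed over all inbound interfaces
    service_per_tick -- packets the outbound line can send per tick
    buffer_size      -- maximum packets the router can hold for this interface
    Returns (delivered, dropped).
    """
    queue = 0
    delivered = 0
    dropped = 0
    for offered in arrivals:
        # Accept only what fits in the buffer; the rest is lost.
        accepted = min(offered, buffer_size - queue)
        dropped += offered - accepted
        queue += accepted
        # The outbound line drains at a fixed rate.
        sent = min(queue, service_per_tick)
        queue -= sent
        delivered += sent
    return delivered, dropped


def loss_rate(arrivals, service_per_tick, buffer_size):
    """Fraction of offered packets that were dropped."""
    _, dropped = simulate_output_queue(arrivals, service_per_tick, buffer_size)
    total = sum(arrivals)
    return dropped / total if total else 0.0


# Underprovisioned: steady offered load 10% above the outbound capacity.
steady = [11] * 1000
# Provisioned for the average, with an occasional burst larger than the buffer.
bursty = ([4] * 99 + [60]) * 10

print("steady overload loss:", loss_rate(steady, 10, 50))
print("occasional burst loss:", loss_rate(bursty, 10, 50))
```

In this sketch the buffer only delays the onset of loss under steady overload; once it is full, every tick loses the excess, which is the "constant packet loss means underprovisioned" case. In the bursty case the queue drains between bursts and losses stay occasional.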