North American Network Operators Group
Re: Global BGP - 2001-06-23 - Vendor X's statement...
> On Tue, 26 June 2001, "Chance Whaley" wrote:
> > How would you like Vendor X to liberally handle the situation? There is
> > a point when being too liberal causes issue - like this one. The idea is
> > that if the original peer followed the spec it would of been contained
> > at the source and this would of never happened. Where is the line?
> > Something about GIGO comes to mind.
>
> I would prefer implementations (not vendors) reject the one router which
> they don't like, and accept the other 100,000+ routes in the global Internet
> without flapping BGP sessions.
>
> Killing 100,000 routes because you don't like one seems a bit excessive.

But an invalid route should never be received. If it is, something is fundamentally wrong. It's not like, say, a CRC or checksum error, which indicates that a packet got corrupted. That's a normal occurrence, and dropping the packet (and then moving on) is the right thing to do.

But when we're talking about malformed BGP advertisements, we know with a high degree of certainty that the sender is broken. We don't know how broken, nor do we know the details of the brokenness, but we know that it's broken. So our options are:

(1) Reject the one bogus route, accept everything else, thereby assuming that the brokenness extends only to that one route.

(2) Send a NOTIFICATION, drop everything, because we don't want to be accepting routes from a known-to-be-broken router.

(It is important to note that we don't know that the other, non-malformed, routes are good. We only know that they aren't malformed. They may or may not actually be valid.)

Given that we're talking about the routing information for the core of the Internet, the most reasonable thing to do seems to be (2): Discard Everything. After all, we *know* that the stuff being discarded is coming from a broken router ... we just don't know how broken it is.
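To make options (1) and (2) concrete, here is a rough sketch in Python. It is a toy model, not how any real BGP implementation is structured: the `Update` class, its `malformed` flag, the `process` function, and the policy names are all made up for illustration, and the prefixes and AS numbers come from the documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """A toy stand-in for a BGP advertisement (hypothetical, for illustration)."""
    prefix: str
    as_path: tuple
    malformed: bool = False  # assumption: a real parser would work this out itself


def process(updates, policy):
    """Return (accepted_prefixes, session_up) for one peer under a policy.

    "discard_route": option (1), skip the malformed update and keep the rest.
    "reset_session": option (2), tear the session down and drop everything
                     learned from this peer.
    """
    accepted = []
    for update in updates:
        if not update.malformed:
            accepted.append(update.prefix)
            continue
        if policy == "discard_route":
            continue  # ignore just this one advertisement
        if policy == "reset_session":
            return [], False  # NOTIFICATION sent, all routes from the peer gone
        raise ValueError(f"unknown policy: {policy}")
    return accepted, True


updates = [
    Update("192.0.2.0/24", (64500, 64501)),
    Update("198.51.100.0/24", (), malformed=True),
    Update("203.0.113.0/24", (64500, 64502)),
]

print(process(updates, "discard_route"))
print(process(updates, "reset_session"))
```

Note what the sketch cannot show: under "discard_route" the two surviving prefixes are merely well formed. Nothing in the code (or in a real router) can tell whether they are actually valid, and that is the whole problem.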
Why gamble with the backbone by assuming that "hey, it's broken, but the brokenness doesn't extend to sending wrong but correctly formed advertisements"? (Whether or not discarding everything, then bringing the session right back up, downloading routes, eventually getting to the malformed one, and repeating the process is a good idea is a different question.)

Do I wish my dual-homed routers would accept everything else and just ignore the bad route? Sure. But even if the "everything else" is crap, it's not going to get beyond the edge of my network, because I'm not a big provider and I have no BGP anywhere except the edge, and I don't pass what I receive via BGP beyond my network (i.e. I only advertise my routes).

Do I think UUnet should propagate decent-looking routes that it got from a known-to-be-broken neighbor and pass them through its core and on to its peers and BGP-connected transit customers? Probably not. The penalty for passing on bogus information is too high. "Be liberal in what you accept" might suggest that the session should be kept up and the one bad advertisement be discarded, but "Be conservative in what you send" would tend to argue for never, under any circumstances, passing on routes that you received from a known-to-be-broken router.

And, of course, there's the lack of data issue. There have been at least three significant outages that were the result of BGP flappage caused by malformed AS paths. In those three cases, it is generally believed that the Internet would have been better off had the BGP sessions stayed up and just the one malformed advertisement been discarded. Should we therefore change the protocol? We don't know, because we don't know how many times the "if it sends something bogus, assume it's seriously broken and discard everything" rule has saved the Internet from a significant outage.
At a minimum, if we're going to change the RFC based on the notion that "single malformed advertisements from otherwise functional routers" are significantly more prevalent than "routers that advertise a bunch of crap, some of which is correctly formed", we should at least have some data suggesting that this is in fact the case. (We should also consider the consequences of getting this one wrong.)

-- Brett