North American Network Operators Group
Re: Links on the blink - what will/should mci & sprint do?
In message <[email protected]>, Michael Dillon writes:
> On Sat, 18 Nov 1995, Sean Doran wrote:
>
> > | Sounds like there is a need for a good ip switch. Something simple,
> > | very fast, and low cost that you can download "static" routes to.
> >
> > It's called an SSP.
>
> And the problem on the net isn't with SSP's. The problem is that the
> routing tables are NOT static. Switching is working fine, but the size of
> the routing tables (CIDRize or die!) and the constant change in the
> routing tables are the problem. Note that CIDRizing also reduces the
> amount of change in the routing tables by replacing a set of potentially
> varying routes with an unvarying aggregate.
>
> Even building a mondo box to handle huge routing tables and lots of
> changes is not enough to solve the problem because there is also
> the protocol problem whereby routers communicate these route changes to
> one another. This limits the number of BGP peering sessions that are
> practical.
>
> Of course, most people here already know this but for those who are
> trying to understand what is going on, I hope my brief explanation helps.
>
> Michael Dillon                       Voice: +1-604-546-8022
> Memra Software Inc.                    Fax: +1-604-542-4130
> http://www.memra.com                E-mail: [email protected]

Actually, you don't have the problem quite right.

The problem is not the sheer size of the routing table. The 64 MB RP
has fixed that for quite a while. Nor is it the processing load
associated with the route changes. An RS6K can keep up easily if it
doesn't have to page (enough RAM in the box), and so can a 68020 if it
is allowed enough CPU time to do something.

The problem is that when a large set of routes changes, a large set of
cache entries in the SSP is invalidated. This results in a large amount
of traffic being forwarded to the RP. The SSP is bludgeoning the RP in
order to tell it that it needs some cache entries updated. The RP then
can't keep adjacencies up, so more route changes result, which can kill
other routers.
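
The cache-miss effect described above can be illustrated with a toy
model. This is only a rough Python sketch under stated assumptions: the
class names (RouteProcessor, SwitchProcessor), the per-destination
cache keyed by address, and the prefixes and next-hop labels are all
made up for illustration and are not the actual Cisco 7000 cache
design.

```python
import ipaddress


class RouteProcessor:
    """Hypothetical slow path: holds the full table, counts punted lookups."""

    def __init__(self, routes):
        # routes: dict mapping prefix string -> next-hop label
        self.routes = {ipaddress.ip_network(p): nh for p, nh in routes.items()}
        self.punts = 0

    def lookup(self, dst):
        self.punts += 1  # every call here is work taken away from routing
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)  # longest match
        return self.routes[best]


class SwitchProcessor:
    """Hypothetical fast path: forwards from a per-destination cache."""

    def __init__(self, rp):
        self.rp = rp
        self.cache = {}  # destination address string -> next hop

    def forward(self, dst):
        if dst not in self.cache:  # cache miss: punt to the slow path
            self.cache[dst] = self.rp.lookup(dst)
        return self.cache[dst]

    def invalidate(self, prefix):
        """Drop every cached destination covered by a changed prefix."""
        net = ipaddress.ip_network(prefix)
        stale = [d for d in self.cache if ipaddress.ip_address(d) in net]
        for d in stale:
            del self.cache[d]
        return len(stale)


def run_demo():
    rp = RouteProcessor({"10.0.0.0/8": "hop-A", "192.168.0.0/16": "hop-B"})
    ssp = SwitchProcessor(rp)
    dests = [f"10.0.{i}.1" for i in range(200)]

    def punts_for_one_pass():
        before = rp.punts
        for d in dests:
            ssp.forward(d)
        return rp.punts - before

    cold = punts_for_one_pass()        # empty cache: every packet punts
    warm = punts_for_one_pass()        # warm cache: nothing punts
    ssp.invalidate("10.0.0.0/8")       # one route change empties the cache
    after_flap = punts_for_one_pass()  # the punt burst repeats
    return cold, warm, after_flap


if __name__ == "__main__":
    print(run_demo())  # prints (200, 0, 200)
```

In this model a single change to a covering prefix costs as many
slow-path lookups as there are active destinations behind it, which is
the load that lands on the RP at exactly the moment it also has route
changes to process.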

If it gets far enough out of hand, the instability can turn into a
stable oscillation and you have a melted backbone.

This is a consequence of the architecture and the cache design. I've
been pointing this out for years. Now it blew up.

This is very fixable, and Cisco could even fix it without requiring
everyone to throw out their Cisco 7000s. Just get rid of the cache
completely and push full routing from the RP to the SSP!

Curtis

ps - This is my guess. Cisco or Sprint have not yet confirmed or
denied this. Perhaps Sean or Tony would care to comment. ;-)
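
For readers who want the suggested fix in concrete terms, here is a
minimal Python sketch of a forwarding engine that holds the full table
itself instead of a cache. The class name FullTableSwitch, its
install/withdraw methods, and the prefixes are hypothetical, and the
linear scan is only for brevity; a real forwarding engine would
presumably use a trie or similar structure.

```python
import ipaddress


class FullTableSwitch:
    """Hypothetical forwarding engine holding the complete table locally."""

    def __init__(self):
        self.fib = {}  # ip_network -> next-hop label

    def install(self, prefix, next_hop):
        # Route add or change: update one entry, nothing else is touched.
        self.fib[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # Route removal: delete one entry, no cache to flush.
        self.fib.pop(ipaddress.ip_network(prefix), None)

    def forward(self, dst):
        # Longest-prefix match done locally, never punted to a slow path.
        addr = ipaddress.ip_address(dst)
        best = None
        for net in self.fib:
            if addr in net and (best is None or net.prefixlen > best.prefixlen):
                best = net
        return self.fib[best] if best is not None else None


def run_full_table_demo():
    sw = FullTableSwitch()
    sw.install("10.0.0.0/8", "hop-A")
    sw.install("10.1.0.0/16", "hop-B")
    results = [sw.forward("10.1.2.3"), sw.forward("10.2.3.4")]
    sw.install("10.1.0.0/16", "hop-C")  # route change: one entry rewritten
    results.append(sw.forward("10.1.2.3"))
    sw.withdraw("10.1.0.0/16")          # withdrawal: falls back to the /8
    results.append(sw.forward("10.1.2.3"))
    return results


if __name__ == "__main__":
    print(run_full_table_demo())  # prints ['hop-B', 'hop-A', 'hop-C', 'hop-A']
```

The point of the sketch is that a route change costs one table update
pushed from the route processor, independent of how many destinations
are currently receiving traffic, so there is no miss traffic to send
back.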