North American Network Operators Group
MPLS routing loop
Has anyone out there deployed MPLS TE in a hybrid Cisco and Juniper environment using IS-IS? In the lab, we've come across an interoperability issue, and we're wondering if anyone else has seen it and, more importantly, determined a workaround.

The issue is a routing loop caused by the differences in how IOS and JUNOS determine the best IP route when LSPs are present.

Cisco has two approaches for assigning metrics to LSP tunnels: absolute and relative. Absolute metrics are, for the most part, independent of the underlying IGP metric. Relative LSP metrics are based on the underlying IGP; they change when the IGP metrics change. When an absolute metric is set on a Cisco MPLS tunnel, the metric applies not only to the path to the egress router, but to all paths downstream of the egress router as well. So given A-B-C-D-E, with A (Cisco) having a tunnel to B with absolute metric m, A will have routes to B, C, D, and E all with metric m, no matter what the IGP link metrics are between B, C, D, and E.

JUNOS behaves differently. Downstream IGP metrics are added to the tunnel metric, so route selection by A, if it were a Juniper, would consider the B-C-D-E IGP metrics before installing the route.

OK. So given this behavior, here's how the loop occurs:

    x                 y
    |                 |
    |                 |
    A                 B
    |                 |
    |                 |
    J-----------------C

    J = Juniper
    C = Cisco

x and y are EBGP peers that advertise prefix z with the same BGP attributes. J-C has an IS-IS link metric of 1000. J-A has an IS-IS link metric of 10. C-B has an IS-IS link metric of 5.

Now build LSP Q (in both directions) between J and C, with LSP metric 3. This Cisco-originated metric is "absolute".

Routers A and B have EBGP sessions to x and y, respectively. J and C have BGP sessions to both A and B (full IBGP mesh). The route from C to A is through J with metric 3. The route from J to B is through C with metric 3 + 5 = 8.

The Juniper, J, sees two routes to destination z with the determining factor being IGP distance.
J has a metric of 10 to A and a metric of 8 to B, so it forwards the packets to C.

The Cisco, C, sees two routes to destination z with the determining factor being IGP distance. It has a metric of 3 to A (because it doesn't add the downstream metric) and a metric of 5 to B. The Cisco forwards the packet to J.

Voila. Loop.

With a full LSP mesh of routers, you wouldn't see this problem, but there are reasons why you might not have a full mesh.

- Some routers on a network might not support MPLS.
- Link or protocol (e.g. RSVP) failures may cause a node to drop out of the mesh.
- And then there are always misconfigurations.
- Large providers may decide to have a core LSP mesh only, to minimize scaling complexity.
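For anyone who wants to check the arithmetic, here is a rough Python sketch of the example above. It is a toy model, not how IOS or JUNOS actually run SPF: the names (LINKS, TUNNELS, VENDOR, route, forward) and the "absolute"/"additive" labels are made up for illustration, and it assumes an LSP is only considered for destinations at or behind its tail on the IGP shortest path.

```python
import heapq

# IS-IS link metrics from the example (linear topology A - J - C - B).
LINKS = {("J", "A"): 10, ("J", "C"): 1000, ("C", "B"): 5}
# LSP Q is built in both directions between J and C with metric 3.
TUNNELS = {"J": ("C", 3), "C": ("J", 3)}
# Hypothetical labels for the two metric behaviours described in the post.
VENDOR = {"J": "additive", "C": "absolute"}


def neighbors(node):
    """Yield (neighbor, link metric) for every link touching node."""
    for (a, b), metric in LINKS.items():
        if a == node:
            yield b, metric
        elif b == node:
            yield a, metric


def igp(src):
    """Plain Dijkstra over link metrics: {dest: (distance, first_hop)}."""
    best = {src: (0, None)}
    heap = [(0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best[node][0]:
            continue  # stale heap entry
        first = best[node][1]
        for nbr, metric in neighbors(node):
            cand = dist + metric
            if nbr not in best or cand < best[nbr][0]:
                hop = nbr if first is None else first
                best[nbr] = (cand, hop)
                heapq.heappush(heap, (cand, nbr))
    return best


def route(router, dest):
    """Return (metric, next_hop) to dest as this toy model of router sees it."""
    table = igp(router)
    metric, hop = table[dest]
    tail, lsp_metric = TUNNELS[router]
    beyond_tail = igp(tail)[dest][0]
    # Assumption: the LSP is a candidate only when dest is the tail itself
    # or sits behind the tail on the IGP shortest path.
    if table[tail][0] + beyond_tail == metric:
        if VENDOR[router] == "absolute":
            via_lsp = lsp_metric                # downstream metric ignored
        else:
            via_lsp = lsp_metric + beyond_tail  # downstream metric added
        if via_lsp < metric:
            metric, hop = via_lsp, tail
    return metric, hop


def forward(router, exits=("A", "B")):
    """Pick the BGP exit with the lowest metric; return the next hop used."""
    metric, hop = min(route(router, e) for e in exits)
    return hop
```

In this model, route("J", "B") comes out as (8, "C") and route("C", "A") as (3, "J"), so forward("J") returns "C" and forward("C") returns "J": each router hands the packet to the other. Switching VENDOR["C"] to "additive" makes C's metric to A 3 + 10 = 13, so C prefers B at metric 5 and the loop goes away in the model.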