IP Fast Reroute, LFA (Loop Free Alternate), Remote LFA, and recovery and protection in general. In this post, I will share a discussion with a member of my Slack group, Driss Jabbar. He is a CCDE, a highly skilled network engineer, and the author of several posts on this website. You can contact him on LinkedIn.
Fast Convergence and the Fast Reroute
Network reliability is an important design aspect when deploying time- and loss-sensitive applications. When a link, node, or SRLG (Shared Risk Link Group) failure occurs in a routed network, there is inevitably a period of disruption to traffic delivery until the network reconverges on the new topology.
For some applications, a fast reaction to the failed element is essential. There are two approaches to reacting quickly to a failure: fast convergence and fast reroute. Although people use these terms interchangeably, they are not the same thing.
In this post, I will explain the definitions of fast convergence and fast reroute and their high-level design considerations.
Fast Reroute mechanisms in IP and MPLS, their design considerations, and the pros and cons of each will be explained in a separate post.
When a local failure occurs, four steps are necessary for convergence. These steps must be completed before traffic continues on the backup/alternate link.
1. Failure detection (protocol hello timers, carrier delay and debounce timers, BFD, and so on)
2. Failure propagation (LSA and LSP throttling timers)
3. Processing of the new information (backup/alternate path calculation; SPF wait and run times)
4. Installation of the new route into the RIB/FIB (after this step, traffic can continue to flow through the backup link)
For fast convergence, these steps are tuned. Tuning the timers generally means lowering them, because most vendors ship higher default values to be on the safe side. As you will see later in this post, lowering these timers can create stability issues in the network.
When you tune the timers for failure detection, propagation, and new path calculation, it is called fast convergence, because traffic moves to the alternate link faster than with regular convergence. (Instead of a 30-second hello timer you can use a 1-second hello, or instead of a 5-second SPF wait time you can use 10 ms, and so on.)
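To make the effect of tuning concrete, here is a minimal Python sketch that adds up the four convergence steps before and after tuning. The detection and SPF wait values come from the example above. The propagation and RIB/FIB update values are hypothetical placeholders chosen only for illustration, and treating the hello timer as the whole detection delay is a simplification. The class and field names are made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ConvergenceBudget:
    """Rough time budget (in milliseconds) for the four convergence steps."""
    detection_ms: float     # step 1: failure detection
    propagation_ms: float   # step 2: failure propagation (hypothetical value)
    computation_ms: float   # step 3: SPF wait before the new path is calculated
    fib_update_ms: float    # step 4: RIB/FIB update (hardware dependent, hypothetical)

    def total_ms(self) -> float:
        # The steps happen one after another, so the delays add up.
        return (self.detection_ms + self.propagation_ms
                + self.computation_ms + self.fib_update_ms)


# Untuned timers: 30 s hello, 5 s SPF wait (from the text); the rest is assumed.
default = ConvergenceBudget(detection_ms=30_000, propagation_ms=500,
                            computation_ms=5_000, fib_update_ms=200)

# Tuned timers: 1 s hello, 10 ms SPF wait (from the text); the rest is assumed.
tuned = ConvergenceBudget(detection_ms=1_000, propagation_ms=50,
                          computation_ms=10, fib_update_ms=200)

print(f"default: {default.total_ms() / 1000:.2f} s")
print(f"tuned:   {tuned.total_ms() / 1000:.2f} s")
```

Note that the RIB/FIB update term stays the same in both cases, which reflects the point that this step cannot be tuned by configuration.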
Although the RIB/FIB update is hardware-dependent, the network operator can configure all the other steps.
One thing always needs to be kept in mind: fast convergence and fast reroute can affect network stability. If you configure the timers very low, you might see false positives, where a healthy link or neighbor is declared down.
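Unlike fast convergence, with fast reroute the backup path is pre-computed and pre-programmed into the router's RIB/FIB. Once the failure is detected, the router can switch to the backup path locally, without waiting for propagation and path calculation. The cost is increased memory utilization on the devices.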
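The idea of a pre-programmed backup can be sketched in a few lines of Python. This is only a conceptual model, not how any particular vendor implements its forwarding table; `FibEntry`, its fields, and the router names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FibEntry:
    """One forwarding entry with an optional pre-installed backup next hop."""
    prefix: str
    primary: str
    backup: Optional[str] = None  # filled in ahead of time when fast reroute is used

    def active_next_hop(self, failed: set) -> Optional[str]:
        # Normal case: the primary next hop is still usable.
        if self.primary not in failed:
            return self.primary
        # Fast reroute case: the backup is already in the table, so the
        # switch is a local decision made right after failure detection.
        if self.backup is not None and self.backup not in failed:
            return self.backup
        # No usable backup: traffic is dropped until the network reconverges.
        return None


with_frr = FibEntry(prefix="10.0.0.0/24", primary="R2", backup="R3")
without_frr = FibEntry(prefix="10.0.0.0/24", primary="R2")

print(with_frr.active_next_hop(failed={"R2"}))
print(without_frr.active_next_hop(failed={"R2"}))
```

The extra `backup` field per entry is also a small picture of why fast reroute uses more memory than plain convergence.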
There are many Fast Reroute mechanisms available today. The best-known ones are Loop Free Alternate (LFA), Remote Loop Free Alternate (rLFA), MPLS Traffic Engineering Fast Reroute, and Segment Routing Fast Reroute.
Loop Free Alternate and Remote Loop Free Alternate are also known as IP or IGP Fast Reroute mechanisms. The main difference between MPLS Traffic Engineering Fast Reroute and the IP Fast Reroute mechanisms is coverage.
MPLS TE FRR can protect any traffic in any topology. IP FRR mechanisms need the physical topology of the network to be highly connected.
Ring and square topologies are hard for the IP FRR mechanisms but not a problem at all for MPLS TE FRR. In other words, finding a backup path is not always possible with IP FRR mechanisms if the physical topology is a ring or a square. From this point of view, the best physical topology is a full mesh.
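The coverage difference between a square and a full mesh can be checked with a small Python sketch. It applies the basic loop-free condition from RFC 5286: a neighbor N of router S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). The four-router topologies, the unit link costs, and the function names are assumptions made for this illustration, and only basic LFA is modelled (no Remote LFA).

```python
import heapq


def build_graph(links):
    """Build an undirected graph {node: {neighbor: cost}} from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_distances(graph, source):
    """Dijkstra: shortest-path cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def loop_free_alternates(graph, source, destination):
    """Neighbors of source that satisfy the basic LFA inequality for destination."""
    dist_s = shortest_distances(graph, source)
    alternates = []
    for neighbor, link_cost in graph[source].items():
        dist_n = shortest_distances(graph, neighbor)
        # Skip primary next hops (neighbors that are on a shortest path).
        if link_cost + dist_n[destination] == dist_s[destination]:
            continue
        # Loop-free condition: the neighbor's own best path must not come back via source.
        if dist_n[destination] < dist_n[source] + dist_s[destination]:
            alternates.append(neighbor)
    return sorted(alternates)


square = build_graph([("R1", "R2", 1), ("R2", "R3", 1),
                      ("R3", "R4", 1), ("R4", "R1", 1)])
full_mesh = build_graph([("R1", "R2", 1), ("R1", "R3", 1), ("R1", "R4", 1),
                         ("R2", "R3", 1), ("R2", "R4", 1), ("R3", "R4", 1)])

print("square:   ", loop_free_alternates(square, "R1", "R2"))
print("full mesh:", loop_free_alternates(full_mesh, "R1", "R2"))
```

In the square, R1 has no loop-free alternate towards R2, because R4 has two equal-cost paths to R2 and one of them goes back through R1. In the full mesh, both R3 and R4 qualify.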