BGP Best External is generally used in active-standby BGP topologies, but it is not limited to them. The feature helps BGP converge much faster by advertising external BGP paths that would not normally be sent because they are not the overall BGP best path.
A router can have a BGP best internal path, a BGP best external path, and a BGP overall best path. The best external path is the best one among the paths learned from EBGP neighbors, and the overall best path is the winner among all paths.
BGP best external in active-standby scenarios can be used in MPLS VPN, Internet business customer connections, EBGP peering scenarios, hierarchical large-scale Service Provider backbones, and many others.
But how is an active-standby connection created with BGP? In which situations do people use active-standby instead of an active-active connection?
Let’s start with the below scenario.
Figure -1 BGP Active-Standby Path Selection Example
The first thing you should know is that the common reason for active-standby (primary-backup) links is that one link is more expensive than the other. Cost doesn’t have to be only a monetary cost; it can also be based on latency, performance, or bandwidth.
In Figure-1: IBGP is running in the Service Provider network. Between R1, R2 and R3 there is an IBGP full mesh session.
R2 and R3 are connected to the customer network, and EBGP is running between them and the customer. Since the BGP Local Preference attribute is set to 200 on R3, R3 is used as the egress point. In this case, the best path in the Service Provider domain for this customer is through R3, and it is advertised to R1 and R2.
Although R2 has a connection to the customer network, the overall best path on R2 is the IBGP path, so R2 doesn’t even advertise its own external path to R3 and R1. This is against the BGP RFC, but almost all vendors implemented their BGP in this way.
Before we look at the impact of this feature and how it behaves with and without BGP PIC, let’s remember how BGP converges when the primary link fails.
If the R3-to-customer link fails, R2 can learn about the failure through IGP or BGP. If the BGP next hop is the R3 loopback (this is always the case with MPLS Layer 3 VPN), R2 cannot detect the external link failure from an IGP update, because the loopback itself stays reachable. In that case R2 waits for the BGP withdrawal and update messages from R3. When the BGP update is completed, R2 installs the prefixes with its external path into the RIB and FIB.
Now let’s enable the feature on R2.
When this feature is enabled on R2, although the overall best path in BGP comes from the IBGP neighbor which is R3, R2 would send its best external path. Since R2 has only 1 external path, R2 would send its path to both R3 and R1.
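The advertisement rule described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not a vendor implementation: the `Path` class, the function names, and the Local Preference of 100 on R2's own external path are illustrative, and best-path selection is reduced to Local Preference with external paths preferred on a tie.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    egress: str        # router whose customer link this path uses
    local_pref: int
    external: bool     # True if learned over EBGP, False if learned over IBGP


def overall_best(paths: List[Path]) -> Path:
    # Simplified decision process: highest Local Preference wins,
    # and an external path beats an internal one on a tie.
    return max(paths, key=lambda p: (p.local_pref, p.external))


def best_external(paths: List[Path]) -> Optional[Path]:
    externals = [p for p in paths if p.external]
    return overall_best(externals) if externals else None


def advertise_to_ibgp(paths: List[Path], best_external_enabled: bool) -> Optional[Path]:
    best = overall_best(paths)
    if best.external:
        return best                      # best path is our own external path
    if best_external_enabled:
        return best_external(paths)      # feature on: still announce the best external path
    return None                          # feature off: nothing is sent to IBGP peers


# R2's view in Figure-1 (Local Preference 100 on R2's own path is an assumption).
r2_paths = [Path("R2", 100, True), Path("R3", 200, False)]

print(advertise_to_ibgp(r2_paths, best_external_enabled=False))  # None
print(advertise_to_ibgp(r2_paths, best_external_enabled=True))   # R2's external path
```

With the feature off, R2 stays silent because its overall best path is the IBGP path through R3. With the feature on, R1 and R3 learn R2's external path as well.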
Here is the trick. Implementations don’t install the best external path into the RIB and FIB of the routers unless BGP PIC is enabled. (Some vendors, for example Cisco, enable BGP PIC by default when BGP best external is enabled.)
Do you think this feature is helpful without BGP PIC?
Yes, actually. In that case, when its external link fails, R3 doesn’t have to wait for a BGP update from R2. It only needs to install the prefixes into the RIB and FIB, because with best external enabled they have already been received from R2 and stored in the BGP RIB.
If BGP PIC and BGP best external are both enabled on R3, then when the R3 external link fails, R3 starts to send the traffic towards R2 right away, because the prefixes are already installed in the RIB and FIB with the backup flag.
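The difference that the pre-installed backup makes can be illustrated with a toy FIB entry. This is only a sketch of the idea: `FibEntry`, `build_r3_entry`, and the `"customer-link"` label are hypothetical names, and real platforms implement the backup pointer in hardware-specific ways.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FibEntry:
    primary: str
    backup: Optional[str] = None   # pre-installed only when PIC is enabled
    primary_up: bool = True

    def next_hop(self) -> Optional[str]:
        if self.primary_up:
            return self.primary
        return self.backup         # None: traffic is dropped until BGP reconverges


def build_r3_entry(pic_enabled: bool) -> FibEntry:
    # R3 prefers its own customer link; R2's best external path is the backup.
    return FibEntry(primary="customer-link", backup="R2" if pic_enabled else None)


with_pic = build_r3_entry(pic_enabled=True)
without_pic = build_r3_entry(pic_enabled=False)

# The R3-to-customer link fails.
with_pic.primary_up = False
without_pic.primary_up = False

print(with_pic.next_hop())     # R2
print(without_pic.next_hop())  # None
```

In the second case the backup path is still known to BGP, so R3 only has to program it into the FIB, but that step takes time that the pre-installed entry avoids.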
You might think that this solves the issue, and that when the primary link fails the secondary link is used immediately without packet loss. Actually, no.
If it's a pure IP network then a micro loop occurs. Because when R3 starts sending the traffic towards R2 (BGP PIC is enabled), R2 doesn’t know yet that the external link of R3 failed. R2 sends the traffic back to R3 and R3 sends it back to R2 because both do the IP lookup for the BGP prefix.
In MPLS VPN it is solved if the VPN label allocation is done per prefix or per CE since R2 and R3, in that case, wouldn’t do the IP lookup but based on the inner (VPN) label, they would start to send the traffic towards the customer.
If VPN label allocation is done per VRF and two CEs are connected to R2, R2 has to do an IP lookup to identify the correct CE. Because of that IP lookup, R2 would send the traffic back to R3 and the micro loop would occur again.
So BGP best external and PIC in an IP network will suffer from a micro loop. But instead of losing seconds or minutes waiting for BGP to converge, when the IGP is tuned the micro loop can be resolved in less than a second, because R2 is notified about R3’s external link failure as fast as possible.
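The forwarding behavior in these cases can be traced with a small simulation. It is a simplified sketch: the `deliver` function, the mode names, and the TTL value are assumptions made for illustration, and the state modeled is the short window after R3's external link has failed but before R2 has learned about it.

```python
from typing import List


def deliver(lookup: str, ttl: int = 4) -> List[str]:
    """Trace a packet that R3 redirects to R2 after its external link fails.

    lookup is one of "ip", "per-vrf", "per-prefix", "per-ce".
    """
    hops = ["R3", "R2"]            # BGP PIC on R3 repairs towards R2
    ttl -= 1
    if lookup in ("per-prefix", "per-ce"):
        hops.append("CE")          # the VPN label selects the CE link, no IP lookup
        return hops

    # Pure IP or per-VRF label: both routers do an IP lookup.
    # R2 still prefers R3 (stale best path), R3 points at its backup R2.
    stale_fib = {"R2": "R3", "R3": "R2"}
    node = "R2"
    while ttl > 0:
        node = stale_fib[node]
        hops.append(node)
        ttl -= 1
    return hops


print(deliver("per-prefix"))  # ['R3', 'R2', 'CE']
print(deliver("ip"))          # ['R3', 'R2', 'R3', 'R2', 'R3']
```

The loop in the second trace ends only when the TTL expires or when R2 learns about the failure and switches to its own external path.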
Now let’s look at another example to see how BGP best external works and how it helps convergence. This example also shows that in certain topologies you may not need BGP Add-path or BGP Shadow RRs/Shadow Sessions to send more than one path from a Route Reflector.
Figure -2 BGP Hierarchical Service Provider Backbone
The above topology was common in the past and is still used in some Service Provider networks.
It is a POP and Core architecture without MPLS in the core. Each POP has Route Reflectors in the data path, with more than one Route Reflector for redundancy, and the routes are summarized at the Core-to-POP boundary.
In Figure -2, for simplicity, there are only 3 POPs connected to the Core network. Each POP has two RRs, and the RRs have full-mesh IBGP sessions between them. In the core, there is a PE which is connected to the customer, and an ASBR which is connected to the upstream provider and receives the BGP prefix. In the POP there is a full-mesh IBGP session as well.
Note that there would be a second level of hierarchy in the Core as well, because when the number of POP locations grows, the number of full-mesh IBGP sessions required between the RRs becomes too large.
For a given prefix, in this picture, we have two paths. Path1 from POP1 and Path 2 from POP3.
BGP best external in this topology can be enabled in two places. It can be enabled on the ASBRs and also Route Reflectors.
Let’s assume Local Preference is set to 200 on the ASBR in POP1 and 100 on the ASBR in POP3. This makes the path through the ASBR in POP1 the overall BGP best path for the prefix.
If BGP best external is enabled only on the ASBRs but not on the Route Reflectors, then the Route Reflectors in POP1 and POP2 don’t receive the best external path, which is Path 2 from POP3.
But RR3-A and RR3-B in POP3 do receive both the overall best path, which is Path 1, and the best external path, which is Path 2, simply because the ASBR in POP3 sends its best external path to its RRs, RR3-A and RR3-B.
Here, BGP Add-path could be used to send the best external path from RR3-A and RR3-B to the POP1 and POP2 Route Reflectors. But the problem with BGP Add-path is that it requires a software and hardware upgrade on every PE, ASBR, and Route Reflector.
Instead, this feature is enabled on Route Reflector as well. This allows RR3-A and RR3-B to send the best external path which is Path 2 to POP1 and POP2 RRs.
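The effect on the POP2 Route Reflectors can be modeled roughly as follows. This sketch makes several assumptions: the `Path` class and function names are hypothetical, the ASBR in POP3 already has best external enabled, selection is by Local Preference only, and "best external" on an RR is treated as the best path learned from routers in its own POP.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    name: str
    local_pref: int
    from_own_pop: bool   # True if learned from an ASBR/PE in the RR's own POP


def export_to_other_rrs(paths: List[Path], best_external_enabled: bool) -> Optional[Path]:
    best = max(paths, key=lambda p: p.local_pref)
    if best.from_own_pop:
        return best                       # normal advertisement of the best path
    if best_external_enabled:
        own = [p for p in paths if p.from_own_pop]
        return max(own, key=lambda p: p.local_pref, default=None)
    return None                           # best path came from another RR: stay silent


def paths_on_rr2(best_external_on_rrs: bool) -> List[str]:
    rr1_view = [Path("Path 1", 200, from_own_pop=True)]
    rr3_view = [Path("Path 1", 200, from_own_pop=False),   # learned from the POP1 RRs
                Path("Path 2", 100, from_own_pop=True)]    # best external from the POP3 ASBR
    received = [export_to_other_rrs(rr1_view, best_external_on_rrs),
                export_to_other_rrs(rr3_view, best_external_on_rrs)]
    return [p.name for p in received if p is not None]


print(paths_on_rr2(best_external_on_rrs=False))  # ['Path 1']
print(paths_on_rr2(best_external_on_rrs=True))   # ['Path 1', 'Path 2']
```

Without the feature on the RRs, Path 2 stops at RR3-A and RR3-B. With it, the POP2 RRs hold both paths without Add-path.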
When the RRs hold both the overall best path and the BGP best external path, the network converges much faster if the overall best path goes down, especially when BGP PIC is used together with this feature on the ASBRs and RRs.
For example, if traffic comes from POP2, which doesn’t have an ASBR, and needs to reach the prefix, RR2-A and RR2-B will have two paths. One is the overall best path, which is Path 1, and the other is the best external path, which is Path 2. Both paths would be installed in the RRs’ RIB and FIB (BGP PIC is enabled in addition to BGP best external). If Path 1 fails, since the best external path is already in the RIB and FIB, BGP PIC just changes the pointer to the best external BGP path and you wouldn’t lose even a packet.
Orhan Ergun, CCIE/CCDE Trainer, Author of Many Networking Books, Network Design Advisor, and Cisco Champion 2019/2020/2021