BGP Next-Hop-Self Behavior
One of my CCIE Enterprise students asked a question about the BGP next hop in an MPLS VPN network, so in this post I will explain BGP next-hop-self behavior in both IP and MPLS networks. I cover this topic in much greater detail in my "BGP Zero to Hero" course.
BGP Next Hop Self in IP Networks
Let's start with the IP network shown below (Figure-1) to explain the BGP Next Hop Self process.
Figure-1 IBGP Next-Hop-Self handling in IP networks
In Figure-1, there is no MPLS service in the network. R1 and R2 run IBGP with R3, and R3 runs EBGP with an upstream provider. When R3 sends the BGP prefixes learned from the provider to R1 and R2, the BGP next hop is unchanged: it remains the upstream router's address on the link between R3 and the Internet.
In other words, if you examine the BGP table of R1 and R2, the next hop of the BGP prefixes coming from the Internet is an address on the R3-Internet link. To forward packets to the destination, the routers must resolve that BGP next hop through the IGP (OSPF, IS-IS, EIGRP). The link between R3 and the Internet (the external link) is not known to the IGP by default. To make it known, the link can be redistributed into the IGP or configured as an IGP passive interface.
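The resolution step described above can be sketched in Python. This is only an illustration, not router code: the addresses (203.0.113.1 as the upstream end of the R3-Internet link, 10.0.0.3/32 as R3's loopback) and the helper name resolve_next_hop are assumptions made up for the example, not values from Figure-1.

```python
import ipaddress

def resolve_next_hop(bgp_next_hop, igp_prefixes):
    """Return the longest-match IGP prefix covering the BGP next hop, or None."""
    next_hop = ipaddress.ip_address(bgp_next_hop)
    candidates = [
        network
        for network in map(ipaddress.ip_network, igp_prefixes)
        if next_hop in network
    ]
    if not candidates:
        return None  # next hop unresolvable: the BGP route cannot be used
    return str(max(candidates, key=lambda network: network.prefixlen))

# Hypothetical addressing, not taken from Figure-1.
igp_prefixes_on_r1 = ["10.0.0.3/32", "10.1.13.0/30"]
external_next_hop = "203.0.113.1"

# The external link is not in the IGP yet.
unresolved = resolve_next_hop(external_next_hop, igp_prefixes_on_r1)
# The external link has been redistributed or made a passive interface.
resolved = resolve_next_hop(external_next_hop, igp_prefixes_on_r1 + ["203.0.113.0/30"])

print(unresolved)  # None
print(resolved)    # 203.0.113.0/30
```

Until the external subnet shows up in the IGP, R1 has nothing to match the next hop against, so the Internet prefixes cannot be installed.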
If you don't want to see external routes in your IGP, the BGP next hop can instead be set to the router's loopback, which is an internal route. To do this, you can create a route map on R3 that sets the next hop to its loopback address, or you can configure next-hop-self on R3 and build the IBGP sessions between the routers' loopbacks. In that case, the BGP source interfaces are the loopbacks of R1, R2, and R3.
As you can see, when there is no MPLS VPN service, prefixes received from EBGP are advertised to IBGP neighbors without changing the next hop. If you don't want the external link in the network, a manual operation is required on the edge router to set the next hop to itself. It is important to know that when the external link is not used as the next hop and that link fails, traffic is blackholed (dropped at the edge router) until the BGP control plane converges. BGP PIC Edge solves this problem by installing an alternate route in the forwarding table.
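To make the difference concrete, here is a small Python model of what R3 hands to its IBGP neighbors with and without next-hop-self. It is a simplified sketch, not a BGP implementation: the BgpRoute class, the advertise_to_ibgp function, and all addresses (198.51.100.0/24, 203.0.113.1, 10.0.0.3) are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class BgpRoute:
    prefix: str
    next_hop: str

def advertise_to_ibgp(route, next_hop_self, local_address):
    """Model how an edge router passes an EBGP-learned route to an IBGP peer.

    Without next-hop-self the next hop is left untouched; with it, the
    next hop is rewritten to the address the IBGP session is sourced from.
    """
    if next_hop_self:
        return replace(route, next_hop=local_address)
    return route

# Hypothetical values: R3's loopback and a route learned from the upstream.
R3_LOOPBACK = "10.0.0.3"
from_upstream = BgpRoute(prefix="198.51.100.0/24", next_hop="203.0.113.1")

default_update = advertise_to_ibgp(from_upstream, False, R3_LOOPBACK)
nhs_update = advertise_to_ibgp(from_upstream, True, R3_LOOPBACK)

print(default_update.next_hop)  # 203.0.113.1 (address on the external link)
print(nhs_update.next_hop)      # 10.0.0.3 (R3's loopback, already in the IGP)
```

With the loopback as the next hop, R1 and R2 only need the loopback route in the IGP, which they already have because the IBGP sessions are built on it.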
BGP Next Hop Self in MPLS Networks
Let's take a look at the MPLS VPN network and see how the BGP next-hop-self operation is done.
Figure-2 MPLS Network
Figure-2 shows the MPLS components, so let's examine the MPLS Layer 3 VPN service. MPLS Layer 3 VPN requires the PE router to be a Layer 3 neighbor of the CE routers. The PE-CE routing can be static routes, RIP, EIGRP, OSPF, IS-IS, or BGP. The PE receives IP prefixes from the CE routers and prepends an RD (Route Distinguisher) to each of them, which creates completely new VPN prefixes (RD+IPv4=VPNv4). PE routers re-originate all customer prefixes regardless of their origin (static redistribution or PE-CE OSPF/IS-IS/EIGRP/BGP) and advertise them to all MP-IBGP peers with the BGP next hop set to themselves.
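The RD step can be shown with a few lines of Python. The sketch below assumes a type 0 RD as defined in RFC 4364 (2-byte type, 2-byte AS number, 4-byte assigned number) placed in front of the 4-byte IPv4 address. The function names, the AS number 65000, and the customer address 10.1.1.0 are made up for the example.

```python
import ipaddress
import struct

def make_rd_type0(asn, assigned_number):
    """Encode a type 0 RD: 2-byte type (0), 2-byte ASN, 4-byte assigned number."""
    return struct.pack("!HHI", 0, asn, assigned_number)

def make_vpnv4_address(rd, ipv4_address):
    """Build a 12-byte VPNv4 address: 8-byte RD followed by the 4-byte IPv4 address."""
    return rd + ipaddress.IPv4Address(ipv4_address).packed

# Two hypothetical customers that both use 10.1.1.0 in their own VRFs.
customer_a = make_vpnv4_address(make_rd_type0(65000, 1), "10.1.1.0")
customer_b = make_vpnv4_address(make_rd_type0(65000, 2), "10.1.1.0")

print(len(customer_a))           # 12
print(customer_a.hex())          # 0000fde8000000010a010100
print(customer_a != customer_b)  # True
```

Because the RD differs, the two customers' overlapping IPv4 prefixes become two distinct VPNv4 prefixes in MP-BGP.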
Unlike in the IP network, you don't need to do any manual operation here. The MP-BGP neighborship between the PE routers should be created between their loopbacks. In that case, the loopback is automatically set as the next hop without configuring BGP next-hop-self. So BGP next-hop-self is already an automated process in MPLS VPN, and you would not want to advertise the external (PE-CE) interfaces into the IGP anyway, for scalability and stability reasons. Scalability would suffer because many customer interfaces would be advertised into the IGP, and the IGP would not scale.
Stability would suffer because every time an interface flaps, the SPF or DUAL algorithm would run. You may ask how the SP can monitor those interfaces in MPLS VPN. Those interfaces are placed in a Network Management VRF and carried to the Network Management System through MP-BGP.
That covers BGP Next Hop Self in IP and MPLS networks in this short blog post; you can find dozens of BGP-related blog posts on the website! BGP Next Hop Self works the same way on Cisco, Juniper, Nokia, and many other big vendor devices!