In this post we discuss how hairpinning occurs in some Enterprise branch-site topologies that connect to two Service Providers, where the two Service Providers are not directly connected to each other and have no MPLS VPN Inter-AS option (A, B, C, ...) between them.
On the customer side, some remote sites use only the first service provider (SP-1), others use only the second service provider (SP-2), and the rest use both service providers as their transit network.
Branch sites that use both service providers as their transit network to reach headquarters use redundant Edge Routers for this dual-homed connection: the first Edge Router is connected to SP-1 and the second Edge Router is connected to SP-2.
As shown in the diagram below, the customer's Local Area Network is connected to these two Edge Routers through Layer 2 access switches, so the Edge Routers act as HSRP gateways for the LAN users. SP-1 is the preferred WAN transport for this Enterprise Network, so GW-1 is the HSRP Active gateway for the LAN users and GW-2 is HSRP Standby. This means that all of the customer's LAN traffic bound for the WAN reaches GW-1 first, regardless of whether the destination remote site is connected to SP-1, SP-2, or both.
Figure-1 : Customer Branch Site Edge Network
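The Active/Standby outcome described above can be sketched in a few lines of Python. This is a simplified model, not a protocol implementation: the priority values (110 and 100) and the LAN interface addresses are assumptions chosen for illustration, and the sketch only covers two routers contending at the same time, ignoring preemption and timers. It relies on the HSRP rule that the highest priority wins and a tie is broken by the highest interface IP address.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class HsrpRouter:
    name: str
    priority: int          # HSRP priority (the protocol default is 100)
    address: IPv4Address   # LAN interface address (hypothetical values below)


def elect_active(routers):
    """Return (active, standby) for routers contending at the same time."""
    # Highest priority first; the highest IP address breaks a tie.
    ranked = sorted(routers, key=lambda r: (r.priority, r.address), reverse=True)
    return ranked[0], ranked[1]


# Assumed values: GW-1 is given the higher priority because SP-1 is preferred.
gw1 = HsrpRouter("GW-1", 110, IPv4Address("10.1.1.2"))
gw2 = HsrpRouter("GW-2", 100, IPv4Address("10.1.1.3"))

active, standby = elect_active([gw1, gw2])
print(f"Active: {active.name}, Standby: {standby.name}")
```

With these assumed priorities GW-1 becomes Active, which is why every LAN packet lands on GW-1 no matter which provider serves the destination.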
IP SLA probes can be tied to HSRP so that if the Active router's WAN transport fails, the Standby Edge Router takes over as the HSRP Active gateway and forwards traffic to the remote destination. The HSRP Active router decides where to send the traffic based on the Enterprise Network's routing protocol and the route received from the remote site. It can send the traffic directly over the SP-1 network, or it can hand it to the other Edge Router (Edge GW-2) to be sent to the remote site over the SP-2 network.
In the second case, the traffic is sent from Edge GW-1 to Edge GW-2 out of the same interface on which it was received from the clients inside the LAN. This is referred to as hairpinning.
Figure-2 : Outbound Traffic Pattern from the Branch-Site when the LAN users' gateway is GW-1 and the destination is reached via SP-2
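To make the condition concrete, the following Python sketch flags a packet as hairpinned when the egress interface chosen by the route lookup is the same interface it arrived on. Everything specific here is assumed for illustration: the prefixes, the interface names `LAN0` and `WAN0`, and the routing table contents are hypothetical, and the lookup is a plain longest-prefix match rather than a model of any particular router.

```python
from ipaddress import ip_address, ip_network

# Hypothetical routing table on GW-1: prefix -> (next hop, egress interface).
GW1_ROUTES = {
    ip_network("10.10.0.0/16"): ("SP-1 PE", "WAN0"),  # remote site behind SP-1
    ip_network("10.20.0.0/16"): ("GW-2", "LAN0"),     # remote site behind SP-2
}


def lookup(routes, destination):
    """Longest-prefix match; returns (next_hop, egress_interface) or None."""
    dest = ip_address(destination)
    matches = [net for net in routes if dest in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]


def is_hairpin(routes, ingress_interface, destination):
    """True when the packet would leave through the interface it arrived on."""
    route = lookup(routes, destination)
    return route is not None and route[1] == ingress_interface


# LAN clients always enter GW-1 on LAN0.
print(is_hairpin(GW1_ROUTES, "LAN0", "10.10.5.1"))  # False: exits on WAN0
print(is_hairpin(GW1_ROUTES, "LAN0", "10.20.5.1"))  # True: sent back out LAN0 to GW-2
```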
The way to avoid sending the traffic back out of the same interface is to add a link between the two Edge Routers. This link is used only for router-to-router traffic, and an IGP runs over it between the two routers. It can be a dedicated physical interface, or a subinterface configured on both routers on the existing interface.
This traffic pattern is still suboptimal and is not the preferred design. Regardless of which SP the destination remote site is connected to, the traffic first reaches the HSRP Active router, which then makes the routing decision. If the remote site is connected to the SP-1 WAN transport, GW-1 sends the traffic directly out of its uplink; if the remote site is connected to SP-2, GW-1 sends it to GW-2 to be forwarded over the SP-2 WAN transport.
Figure-3 : Outbound Traffic Pattern (after adding an L3 link between GW-1 and GW-2) when the LAN users' gateway is GW-1 and the destination is reached via SP-2
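A small Python sketch shows why the extra link removes the hairpin but not the detour. It assumes, as in the topology described here, that GW-1 owns the SP-1 uplink and GW-2 owns the SP-2 uplink; the function name and the hop labels are hypothetical and only serve to list the hops a packet takes.

```python
# Which edge router owns the uplink to each provider (from the topology above).
UPLINK_OWNER = {"SP-1": "GW-1", "SP-2": "GW-2"}


def outbound_path(active_gateway, destination_sp):
    """List the hops a LAN packet takes when the gateways share a routed link."""
    path = ["LAN", active_gateway]
    exit_router = UPLINK_OWNER[destination_sp]
    if exit_router != active_gateway:
        # The packet crosses the dedicated GW-to-GW link instead of the LAN
        # segment, so it is no longer hairpinned, but it still takes an extra hop.
        path.append(exit_router)
    path.append(destination_sp)
    return path


print(outbound_path("GW-1", "SP-1"))  # ['LAN', 'GW-1', 'SP-1']
print(outbound_path("GW-1", "SP-2"))  # ['LAN', 'GW-1', 'GW-2', 'SP-2']
```

The SP-2 path is one router hop longer than it needs to be, which is the suboptimal routing that remains after the link is added.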
One solution (perhaps the easiest of several) to this suboptimal routing issue is to place Layer 3 distribution switches in front of the Edge Gateways, so that the L3 distribution switches act as the gateway for the LAN users inside the branch office.
The distribution switches are redundantly connected to the Edge Routers over Layer 3 links running an IGP, and they learn the destination routes from the Edge Routers. As a result, regardless of which distribution switch holds the HSRP Active or Standby role, traffic is sent from the branch-site LAN toward the destination remote site without any hairpinning or suboptimal routing.
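The effect of moving the gateway to a Layer 3 distribution switch can be sketched as a route selection in Python. The prefixes and IGP metrics below are assumptions made up for illustration (real metrics depend on the IGP and link costs); the point is only that the switch learns each remote prefix from both Edge Routers and forwards straight to the one with the better path.

```python
from ipaddress import ip_address, ip_network

# Hypothetical IGP routes as seen by a distribution switch:
# (prefix, next-hop edge router, metric). The metrics are illustrative only.
LEARNED_ROUTES = [
    (ip_network("10.10.0.0/16"), "GW-1", 20),  # SP-1 site, GW-1 has the uplink
    (ip_network("10.10.0.0/16"), "GW-2", 30),  # worse: GW-2 would have to relay it
    (ip_network("10.20.0.0/16"), "GW-2", 20),  # SP-2 site, GW-2 has the uplink
    (ip_network("10.20.0.0/16"), "GW-1", 30),  # worse: GW-1 would have to relay it
]


def next_hop(routes, destination):
    """Pick the next hop: longest prefix first, then lowest metric."""
    dest = ip_address(destination)
    candidates = [r for r in routes if dest in r[0]]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: (-r[0].prefixlen, r[2]))
    return best[1]


print(next_hop(LEARNED_ROUTES, "10.10.5.1"))  # GW-1
print(next_hop(LEARNED_ROUTES, "10.20.5.1"))  # GW-2
```

Because the choice is made per destination on the distribution switch, traffic for an SP-2 site never has to visit GW-1 first.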
This solution adds CapEx and OpEx to the customer's edge network design, along with more implementation and support effort. The suboptimal routing issue alone is unlikely to justify changing the network topology; other major factors come into play, which can be discussed later.
This was one simple solution out of several designs for solving hairpinning and suboptimal routing in this type of topology. More advanced design solutions will be discussed in later articles.