I received an interesting comment on my last post on networkcomputing. It was about Avaya’s SPB and how it served in the core of the network at the Sochi Olympics.
For those who are not familiar with the acronym, SPB stands for Shortest Path Bridging. It is used for large scale bridging in the data center, though it is not limited to data center environments.
The idea behind SPB is to remove the Spanning Tree Protocol and gain multipathing, which is easy to achieve with layer 3 routing. With SPB, bridging can be implemented in a smarter way, and better resiliency is achieved compared to classical bridging.
In this post I will explain large scale bridging, layer 2 multipathing technologies, and some vendor implementations such as Cisco's FabricPath and Avaya's SPB. I will mention their pros and cons as well. Before going into the technical details of these technologies, let’s examine traditional/classical bridging and see how the large scale bridging problem is solved, while examining the resiliency and multipathing capabilities of each solution.
Bridging: Within the same subnet (broadcast domain), data communication is done through bridging. MAC addresses identify a unique device at this layer, and the data unit is called a frame. Separation and segregation of traffic is done with 802.1Q (VLAN).
One physical underlay network is used by many overlay networks thanks to VLAN (Virtual Local Area Network) technology. 802.1Q is the IEEE standard that supports VLANs on Ethernet networks. The scalability of bridging is determined by the 12-bit VLAN ID field within the 4-byte VLAN header, so it can scale up to 4096 VLANs (IDs 0 and 4095 are reserved). Traditional bridging uses spanning tree as its control plane mechanism, so the resiliency, loop avoidance, and reliability of traditional bridging are controlled by spanning tree.
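To make the 12-bit limit concrete, here is a minimal sketch of how the 4-byte 802.1Q tag is laid out: a 16-bit TPID followed by a 3-bit priority, a 1-bit DEI, and the 12-bit VLAN ID. The function names are my own and only for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def build_dot1q_tag(vlan_id: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Pack a 4-byte 802.1Q tag: TPID, then PCP (3 bits), DEI (1 bit), VLAN ID (12 bits)."""
    if not 0 <= vlan_id < 2 ** 12:
        raise ValueError("VLAN ID must fit in 12 bits (0..4095)")
    if not 0 <= pcp < 8 or dei not in (0, 1):
        raise ValueError("PCP must be 0..7 and DEI must be 0 or 1")
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_dot1q_tag(tag: bytes) -> dict:
    """Unpack a 4-byte 802.1Q tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,
        "pcp": tci >> 13,
        "dei": (tci >> 12) & 1,
        "vlan_id": tci & 0x0FFF,
    }


print(build_dot1q_tag(100, pcp=5).hex())  # 8100a064
print(2 ** 12)                            # 4096 possible VLAN ID values
```

Any VLAN ID above 4095 simply does not fit in the field, which is the scaling limit the rest of this article works around.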
Problems with bridging:
Many problems can be defined for bridging depending on which business requirement is a concern, but I will give one specific business requirement to explain the next solution, which came out to address it.
Business Requirement:
Imagine a cloud or a service provider environment: 4096 VLAN IDs can easily be consumed due to multi-tenancy and/or multilayer application architecture.
IEEE 802.1ad Provider Bridging came out to address the 4096 VLAN scalability limit for service providers and large scale bridging environments such as private and public clouds.
Also note that Provider Bridging is used by service providers as an access technology.
IEEE 802.1ad Provider Bridging:
The first protocol to address the VLAN limitation of large scale networks was 802.1ad Provider Bridging. Since a picture is worth a thousand words, I will explain all the technologies in this article with pictures, without going into all the technical details.
Figure-1 – Bridging, VLANs, Provider Bridging, Provider Backbone Bridging
Figure-1 shows the evolution of the Ethernet frame format. It includes the traditional Ethernet frame with the 802.1Q header.
Provider bridging is also known as QinQ or .1q tunneling. It stacks VLAN tags, so theoretically 16 million segments (4096 x 4096) can be used overall. The solution is achieved using two VLAN tags, and the classical use case for provider bridging is the service provider environment.
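As a rough sketch of the tag stacking, the outer service tag (S-tag) goes in front of the inner customer tag (C-tag). The TPID values below are the standard ones for 802.1ad and 802.1Q; the helper names are assumptions of mine for illustration.

```python
import struct

TPID_STAG = 0x88A8  # outer service tag (802.1ad)
TPID_CTAG = 0x8100  # inner customer tag (802.1Q)


def vlan_tag(tpid: int, vlan_id: int) -> bytes:
    """Build one 4-byte VLAN tag with priority and DEI left at zero."""
    if not 0 <= vlan_id < 4096:
        raise ValueError("VLAN ID must fit in 12 bits")
    return struct.pack("!HH", tpid, vlan_id)


def qinq_tags(s_vid: int, c_vid: int) -> bytes:
    """Stack the provider S-tag in front of the customer C-tag."""
    return vlan_tag(TPID_STAG, s_vid) + vlan_tag(TPID_CTAG, c_vid)


# Two independent 12-bit IDs give the theoretical segment count.
THEORETICAL_SEGMENTS = 4096 * 4096

print(qinq_tags(10, 200).hex())  # 88a8000a810000c8
print(THEORETICAL_SEGMENTS)      # 16777216
```

Note that nothing in the stacked tags touches the customer source and destination MAC addresses; only 4 more bytes of tag are added.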
The service provider allocates a separate VLAN for each customer; even if a given customer uses all 4096 VLANs, a single service provider VLAN is enough for customer separation. Of course the customer may want to receive different services (voice, Internet, VPN); in that case different VLANs in the service provider network are used to map customer traffic to service provider VLANs.
Scaling the number of VLANs is achieved via provider bridging, but as you can see, Cust SA and Cust DA, which are the customer source and destination MAC addresses respectively, are the same for both bridging and provider bridging. All the nodes in the layer 2 domain of the service provider (or, in the case of a data center, the data center switches) have to know all the customer MAC address information.
This means that although we can address the 4K limitation of bridging by adding another VLAN tag with PB (Provider Bridging), the source and destination MAC addresses still belong to the inner frame, which is the customer's, not the service provider's. Why is this important?
For the CLOS/leaf and spine architecture, leaf nodes encapsulate/decapsulate the additional dot1q tag. Classical bridging rules apply, so source MAC address learning, aging, and flooding don’t change. Also, since leaf nodes do not encapsulate the frame with a new IP or MAC header, the inner MAC header is visible to the spine, which can be considered the core of the data center. (This is not a problem with SPBM, TRILL, or FabricPath, since the customer MAC addresses are encapsulated in a different layer 2 or layer 3 header in each solution.)
This example can be extended to the service provider. At the edge of the network, aggregation nodes of the service provider add another VLAN tag in front of the customer VLAN tag, but customer MAC addresses are still visible to all the devices within the service provider’s layer 2 backbone. This creates scalability challenges for the switches on the path, since all devices have finite resources, and this applies to their MAC address tables as well. (TCAMs are expensive.)
Provider bridging, like traditional bridging, might use spanning tree for loop avoidance, so all the control properties of spanning tree affect both bridging and provider bridging. These properties include the lack of layer 2 multipath capability, a suboptimal forwarding path in case of failure, slow convergence, and so on.
Problems with Provider Bridging:
Almost everything is the same as bridging except the second VLAN ID for scalability. Thus all the problems with bridging also apply to provider bridging, namely spanning tree resiliency, lack of multipath, configuration complexity, and so on. With provider bridging, the VLAN limitation is avoided by adding a second VLAN ID in front of the inner VLAN ID, and we saw how this can create a scalability problem on the core devices due to the lack of aggregation or hiding.
You don’t have to use spanning tree as the control plane when you use Provider Bridging. You can use G.8032, REP, Spanning Tree, MPLS-TP, MPLS, and so on. But if you use spanning tree as the control plane, all the spanning tree limitations still apply regardless of PB.
So by solving the customer VLAN limitation problem with a second VLAN tag, we can create a state problem (MAC, ARP, and so on) on the core devices.
This brings us to the third solution, 802.1ah Provider Backbone Bridging, which is also known as MAC-in-MAC.
IEEE 802.1ah Provider Backbone Bridging:
The frame format is different: we hide the customer MAC addresses from the core of the network and scale up to 16 million segments, the same as provider bridging (VLAN in VLAN, QinQ). Perfect, right?
Let’s not start that discussion before learning the other parameters as well. Let me explain what provider backbone bridging is.
The inner MAC address is encapsulated within an outer MAC address. For a better explanation, I will use the service provider example above. The customer Ethernet frame is encapsulated within the service provider frame, and this is where the name MAC-in-MAC comes from. If you encapsulate X into Y, you get an abstraction. You can hide "X" from the intermediate devices, so the core can be more scalable, the same as MPLS layer 3 VPN, which you might be familiar with. (Smart edges, dumb core.)
An Ethernet frame could also be encapsulated into IP and we would still get an abstraction, so MAC addresses could be hidden from the backbone.
All the overlay networking mechanisms in particular give us this ability. If we put all overlay technologies into two categories, Network Overlays and Host Overlays (VXLAN, NVGRE, STT, Geneve), and examine each of them, we can see that overlay networks should not and do not create a scalability issue for the underlying physical networks. The overall complexity of the solution might increase, but this is the topic of my network complexity article.
One of the reasons to create an overlay network is to separate complexity from complexity: without concern for the underlying physical network, the overlay should give us the ability to change, manage, and deploy new technologies on top of it.
This is the same for PBB, SPB (which we will see in the next section), TRILL, FabricPath, LISP, DMVPN, GETVPN, GRE, and so on. It is also the same for host based overlays such as Geneve, VXLAN, NVGRE, and STT. You encapsulate frames into another header, such as an IP packet, and hide the complexity from the core. Yes, this can create a visibility problem for the core network, but there is no free lunch!
Figure-2 shows the simplified provider backbone bridging frame format. Unlike provider bridging, leaf nodes or service provider edge devices use the backbone source and destination MAC addresses for end to end communication. The backbone VLAN is used to segregate the backbone into broadcast domains. The I-TAG, which carries a 24-bit service identifier (I-SID), is used for service separation.
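A toy model may help to see why this encapsulation protects the core. In the sketch below the class and function names are hypothetical, and the core switch is assumed to learn only from the outer (backbone) source address, which is the behavior described above. The I-SID carried in the I-TAG is 24 bits wide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerFrame:
    dst_mac: str
    src_mac: str
    c_vid: int


@dataclass(frozen=True)
class PbbFrame:
    b_dst: str            # backbone destination MAC (egress edge node)
    b_src: str            # backbone source MAC (ingress edge node)
    b_vid: int            # backbone VLAN, 12 bits
    i_sid: int            # service instance ID from the I-TAG, 24 bits
    inner: CustomerFrame  # untouched customer frame


def pbb_encapsulate(inner, b_src, b_dst, b_vid, i_sid):
    """Wrap a customer frame in a backbone header at the edge node."""
    if not 0 <= b_vid < 2 ** 12:
        raise ValueError("B-VID must fit in 12 bits")
    if not 0 <= i_sid < 2 ** 24:
        raise ValueError("I-SID must fit in 24 bits")
    return PbbFrame(b_dst, b_src, b_vid, i_sid, inner)


def core_learned_macs(frames):
    """MACs a core switch would learn: it only reads the outer source address."""
    return {frame.b_src for frame in frames}


# 1000 customer hosts spread behind two edge nodes.
edges = ["02:00:00:00:00:01", "02:00:00:00:00:02"]
frames = [
    pbb_encapsulate(
        CustomerFrame("aa:00:00:00:00:01", f"cc:00:00:00:{i // 256:02x}:{i % 256:02x}", 10),
        b_src=edges[i % 2], b_dst="02:00:00:00:00:03", b_vid=100, i_sid=20000,
    )
    for i in range(1000)
]
print(len({f.inner.src_mac for f in frames}))  # 1000 customer MACs at the edge
print(len(core_learned_macs(frames)))          # 2 backbone MACs in the core
```

With provider bridging the same core switch would have to hold all 1000 customer entries, since there is no outer MAC header to learn from.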
For those who are familiar with MPLS, let me give an analogy. The backbone source and destination MAC addresses can be thought of as the loopbacks of the PE devices. The BB VLAN might be the transport label, and the I-TAG can be considered the VC label.
This is very similar to a layer 2 MPLS VPN service. You can scale, and you can protect the core of the network from edge state by hiding MAC address information from it. But layer 2 MPLS VPN does not use spanning tree as the control plane, at least in the core of the network; it relies on a split-horizon mechanism for the overlay, and the underlay can use any routing protocol for topology creation. PBB may or may not use spanning tree as its control plane.
PBB is a data plane encapsulation, so the control plane protocol choice is up to you.
If you use PBB-EVPN, you can carry layer 2 MAC address information between the PE devices of the service provider while using only the PE MAC addresses in the core. The control plane in the case of PBB-EVPN is Multiprotocol BGP.
Conclusion for the provider backbone bridging:
It has the same characteristics as provider bridging: PBB also uses customer and service provider VLANs, so you can theoretically get 16 million different segments, and it may or may not rely on spanning tree for the control plane. But since provider backbone bridging encapsulates the customer MAC into a service provider MAC, customer MAC addresses can be hidden from the service provider core devices.
Problems with Provider Backbone Bridging:
If provider backbone bridging uses spanning tree for the control plane, the lack of layer 2 multipathing and the poor resiliency properties of spanning tree still apply. This leads us to IEEE 802.1aq Shortest Path Bridging.
IEEE 802.1aq Shortest Path Bridging:
It uses IS-IS as the underlying control plane mechanism. Figure-3 can be used to explain shortest path bridging operation. Leaf and spine nodes all run IS-IS to advertise topology information to each other. IS-IS is a link state protocol and uses the Shortest Path First (SPF) algorithm to calculate the shortest path; the "shortest path" part of shortest path bridging comes from this algorithm.
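The SPF calculation can be sketched with a plain Dijkstra run that keeps every equal-cost predecessor, so that all shortest paths between two leaf nodes can be listed. The topology, node names, and link costs below are assumptions for illustration, and the sketch does not model the tie-breaking rules that SPB uses to build its equal cost trees.

```python
import heapq
from collections import defaultdict


def spf(graph, source):
    """Dijkstra's SPF. Returns (dist, preds); preds keeps all equal-cost predecessors."""
    dist = {source: 0}
    preds = defaultdict(set)
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neigh, cost in graph[node].items():
            nd = d + cost
            best = dist.get(neigh, float("inf"))
            if nd < best:
                dist[neigh] = nd
                preds[neigh] = {node}
                heapq.heappush(heap, (nd, neigh))
            elif nd == best:
                preds[neigh].add(node)  # another path with the same cost
    return dist, preds


def all_shortest_paths(preds, source, target):
    """Walk the predecessor sets backwards to list every equal-cost path."""
    if target == source:
        return [[source]]
    paths = []
    for pred in sorted(preds[target]):
        for path in all_shortest_paths(preds, source, pred):
            paths.append(path + [target])
    return paths


# Two leaves, two spines, every link with cost 10.
fabric = {
    "leaf1": {"spine1": 10, "spine2": 10},
    "leaf2": {"spine1": 10, "spine2": 10},
    "spine1": {"leaf1": 10, "leaf2": 10},
    "spine2": {"leaf1": 10, "leaf2": 10},
}
dist, preds = spf(fabric, "leaf1")
print(dist["leaf2"])  # 20
print(all_shortest_paths(preds, "leaf1", "leaf2"))
# [['leaf1', 'spine1', 'leaf2'], ['leaf1', 'spine2', 'leaf2']]
```

Both spine paths come out with the same cost, which is exactly what spanning tree would throw away by blocking one of them.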
But unlike routing, large scale bridging uses the IS-IS link state protocol only for topology information, not for reachability information. This means MAC addresses are not advertised within IS-IS.
An implementation can, however, also use IS-IS to advertise MAC address information, since you just need an additional TLV for this. The scalability of IS-IS for MAC address advertisement is questionable for large scale deployments, so BGP for MAC address distribution and IS-IS for physical topology creation might be a good option. Avaya, though, uses IS-IS in its SPB implementation to distribute both topology and reachability information.
Figure-3 SPB-M Frame
IS-IS is used on the underlying physical network for the layer 2 backbone, and the overlay multi-tenant networks still use the flood and learn mechanism, which is also called data plane learning.
SPB has two flavors, as depicted in Figure-4 below: SPBV (SPB for VLAN) and SPBM (SPB for MAC). SPBV is very flexible and can use the traditional bridging and provider bridging frame formats for the data plane.
This means the bridging and provider bridging frame formats depicted in Figure-1 can be used for SPBV. But since SPBV uses IS-IS as its control plane protocol, layer 2 multipathing is achieved and the protocol can be tuned for resiliency. The problem with SPBV is very similar to that of provider bridging: all the nodes learn the MAC addresses of end hosts, so the scalability of the core network is still a problem with SPBV.
Compared to provider bridging, since SPBV uses IS-IS instead of spanning tree for topology creation, multipathing and the shortest path to the destination are achieved.
Figure-4 SPB-V and SPB-M (source: cisco.com)
You might be asking: if I use PVST+ or Rapid PVST+, which can use a separate tree for each VLAN, can't all the paths in the network still be used? Yes, this is correct, but there are two concerns.
First, you need to carefully design which VLAN will be used on which link, since spanning tree will block the second link. This brings management complexity compared to a single tree for all VLANs and increases troubleshooting time due to the complex configuration.
So with spanning tree (if you are not doing link aggregation) you can have VLAN based load balancing at most. Equal cost multipath or equal cost trees give you flow based load balancing.
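The difference between the two load balancing models can be sketched as follows. The link names, the VLAN to link mapping, and the use of CRC32 over the MAC pair as the flow hash are all assumptions for illustration; real switches use their own hash inputs and functions.

```python
import zlib

LINKS = ["uplink-1", "uplink-2"]

# Per-VLAN spanning tree: the operator decides by hand which VLAN forwards on which link.
VLAN_TO_LINK = {10: "uplink-1", 20: "uplink-2"}


def vlan_based_link(vlan_id: int) -> str:
    """Every flow in a VLAN follows the one link that is forwarding for that VLAN."""
    return VLAN_TO_LINK[vlan_id]


def flow_based_link(src_mac: str, dst_mac: str, links=LINKS) -> str:
    """Hash the flow identity onto one of the equal-cost links."""
    key = f"{src_mac}>{dst_mac}".encode()
    return links[zlib.crc32(key) % len(links)]


flows = [(f"aa:00:00:00:00:{i:02x}", "bb:00:00:00:00:01") for i in range(8)]

# All eight flows of VLAN 10 share a single link.
print({vlan_based_link(10) for _ in flows})

# With flow hashing, flows of the same VLAN can land on different links.
for src, dst in flows:
    print(src, "->", flow_based_link(src, dst))
```

The hash keeps all frames of one flow on the same link, so frame ordering inside a flow is preserved while different flows can still use different links.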
Second, since the second link is on standby, if the first link goes down, reconvergence takes time and application traffic running over the active link is dropped during the convergence event. If multipathing is enabled, the secondary link is also active, and only the traffic that runs over the primary link is redirected to the second link. This operation is very fast, even when it is done in software.
If multipathing is implemented in hardware, microsecond-level reconvergence can be achieved; if it is implemented in software, traffic can still continue over the second link within a couple of milliseconds.
As I mentioned earlier, the other version of SPB is SPBM (Figure-3), which is the shortest path bridging MAC-in-MAC solution. It is very similar to provider backbone bridging in the data plane (PBB encapsulation is used), but the very important difference is that shortest path bridging does not use spanning tree as its control plane.
Instead of spanning tree, IS-IS is used to build the topology in Shortest Path Bridging.
Thus SPB supports multipath bridging. MAC addresses are hidden from the core of the network. In the data center leaf and spine architecture depicted in Figure-3, spine switches do not keep state for the MAC addresses and do not know them, so the overall scalability of the fabric can be much higher compared to SPBV.
MAC address learning between leaf switches is still done in the data plane, which means broadcast layer 2 frames are flooded through the spine switches. Multicast can be used for optimal flooding, and conversational MAC learning can also be implemented for SPBM, which means that unless the destination MAC address is known, the source MAC address is not learned from the incoming frame.
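Conversational learning can be sketched with a small model of a leaf switch. The class and method names are hypothetical, and the rule is simplified to exactly what is described above: a remote source MAC is learned only when the frame is addressed to a MAC the leaf already knows locally.

```python
class ConversationalLeaf:
    """Toy leaf switch that learns remote MACs only for active conversations."""

    def __init__(self, local_macs):
        self.local_macs = set(local_macs)  # hosts attached to this leaf
        self.remote_table = {}             # remote MAC -> leaf it sits behind

    def receive_from_fabric(self, src_mac, dst_mac, src_leaf):
        """Return True if the source MAC was learned from this frame."""
        if dst_mac in self.local_macs:
            self.remote_table[src_mac] = src_leaf
            return True
        return False  # flooded or unknown destination: no table entry is spent


leaf = ConversationalLeaf(local_macs=["aa:aa:aa:00:00:01"])

# A flooded broadcast from a remote host does not create an entry.
leaf.receive_from_fabric("bb:bb:bb:00:00:01", "ff:ff:ff:ff:ff:ff", "leaf-2")
print(len(leaf.remote_table))  # 0

# A unicast frame towards a local host does.
leaf.receive_from_fabric("bb:bb:bb:00:00:01", "aa:aa:aa:00:00:01", "leaf-2")
print(leaf.remote_table)  # {'bb:bb:bb:00:00:01': 'leaf-2'}
```

The leaf table then grows with the number of hosts its own hosts actually talk to, not with every host that ever sends a broadcast.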
Alternatively, MAC learning can be implemented in the control plane, which means that in addition to topology creation, IS-IS can also be used for the overlay topologies. This solution can create scalability issues since MAC addresses cannot be aggregated. A better overlay solution can be implemented with BGP.
EVPN, for example, implements such a solution with MP-BGP.
It distributes end host reachability information over Multiprotocol BGP, while the underlying topology uses MPLS for transport tunnel reachability. This will be the topic of another article.
Also, I didn’t cover TRILL and FabricPath, which primarily target the data center fabric. Since this article is already lengthy, let me explain them in a separate article as well.
How about you?
Which bridging technology are you using in your environment?
What problems do you encounter?
As always, let’s discuss them in the comment box below.