Driss Jabbar

VXLAN EVPN – VxLAN is an overlay technology that encapsulates a Layer 2 frame in a UDP packet to extend a Layer 2 domain over a Layer 3 underlay infrastructure. Inside the UDP payload sits the VxLAN header, which carries a 24-bit VxLAN Network Identifier (VNI). This allows more than 16 million logical networks (recall that you can configure only up to 4096 VLANs).
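To make the numbers above concrete, here is a minimal Python sketch that packs and parses the 8-byte VxLAN header as laid out in RFC 7348 (8 flag bits, 24 reserved bits, the 24-bit VNI, 8 reserved bits). The function names are illustrative only and are not part of any library.

```python
import struct

VNI_BITS = 24       # width of the VNI field in the VxLAN header
VLAN_ID_BITS = 12   # width of the 802.1Q VLAN ID field


def build_vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VxLAN header (RFC 7348 layout)."""
    if not 0 <= vni < 2 ** VNI_BITS:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # I flag set: the VNI field is valid
    # First 32-bit word: flags (8 bits) + reserved (24 bits).
    # Second 32-bit word: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", flags << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from an 8-byte VxLAN header."""
    _first_word, second_word = struct.unpack("!II", header)
    return second_word >> 8


print(2 ** VNI_BITS)      # 16777216 possible VNIs
print(2 ** VLAN_ID_BITS)  # 4096 VLAN IDs
header = build_vxlan_header(20000)
print(header.hex(), parse_vni(header))  # 08000000004e2000 20000
```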

The idea behind developing this kind of technology is to meet some business and technical requirements of today’s datacenter architectures, such as:

  • Datacenter interconnect and live migration of workloads.
  • Using infrastructure resources efficiently and getting rid of the scalability issues related to Spanning Tree or to the number of VLANs.
  • Limiting the resource consumption caused by flooding.

The initial version of VxLAN (RFC 7348) had no control plane; it relies on multicast to flood and learn VTEP and end-host reachability. So I wasn’t very excited to learn about it, simply because it’s not scalable (my opinion).

To overcome the limitations and scalability issues of the flood-and-learn behavior, MP-BGP EVPN was integrated with VxLAN as its control plane. The new address-family is called “L2VPN EVPN”.

With the MP-BGP EVPN control plane:

  • The Layer 2 and Layer 3 information learned locally by each VTEP switch is propagated to the other VTEP switches, which allows bridging and routing within the same fabric.
  • Routes are advertised between VTEPs according to a route-target policy, and identical routes from different VRFs are kept apart by the route distinguisher.
  • Layer 2 information is distributed between VTEPs, and ARP information is cached to minimize flooding.
  • The L2VPN EVPN sessions between VTEPs can be authenticated with an MD5 password to prevent a rogue VTEP from being inserted into the network.
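The roles of the route distinguisher and the route target can be modeled in a few lines of Python. This is only an illustrative sketch, not how NX-OS stores routes: the `VpnRoute` type and the second VRF with RD 99:99000 are hypothetical, while 10:10000 is the value used for the HR VRF later in this article.

```python
from typing import NamedTuple


class VpnRoute(NamedTuple):
    rd: str                   # route distinguisher: keeps identical prefixes apart
    prefix: str
    route_targets: frozenset  # extended communities attached on export


def import_routes(routes, import_rts):
    """Return the routes whose route targets match the import policy."""
    wanted = set(import_rts)
    return [r for r in routes if r.route_targets & wanted]


routes = [
    VpnRoute("10:10000", "192.168.20.0/24", frozenset({"10:10000"})),
    # Hypothetical second VRF that reuses the same prefix.
    VpnRoute("99:99000", "192.168.20.0/24", frozenset({"99:99000"})),
]

# The RD is part of the key, so the two identical prefixes do not collide.
table = {(r.rd, r.prefix): r for r in routes}
print(len(table))  # 2

# A VRF importing only 10:10000 sees just the first route.
print([r.rd for r in import_routes(routes, ["10:10000"])])  # ['10:10000']
```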

 

Distributed Anycast Gateway

FHRPs such as HSRP, VRRP and GLBP were designed to keep the gateway available at all times for the hosts in the network. HSRP and VRRP use the same mechanism: one gateway is elected as primary and the other as backup, and both share the same virtual IP (VIP) that the hosts use.

While those protocols work well in campus designs, they have scalability limitations in the datacenter, because HSRP and VRRP are limited to two gateways.

The Distributed Anycast Gateway overcomes this limitation: all VTEPs are configured with the same MAC and IP address, which makes your VxLAN EVPN fabric appear as one distributed gateway. End hosts can use the optimal path for northbound traffic and can migrate seamlessly from one VTEP to another.

No more theory!!!

Let’s get our hands dirty on the CLI and configure VxLAN EVPN using the infrastructure shown in the diagram below:

Figure 1: Physical infrastructure and IP addressing

The lab is based on two spines called “SPINE-1” and “SPINE-2” and four leaves called “LEAF-1”, “LEAF-2”, “LEAF-3” and “LEAF-4”.

LEAF-1 and LEAF-2 are connected to each other to form a vPC pair, as are LEAF-3 and LEAF-4.

The configuration of VPC will not be covered in this document.

 

The leaf switches establish iBGP EVPN sessions with the spines. The spines act as EVPN route reflectors: they exchange L2 and L3 information between the VTEPs and keep the number of iBGP sessions down. Without route reflectors, each VTEP has to establish an iBGP EVPN session to every other VTEP in the AS, which leads to N(N-1)/2 iBGP EVPN sessions, where N is the number of VTEPs.
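The session arithmetic is easy to check with a short Python sketch. It assumes, as in this lab, that every VTEP peers with every route reflector and that the route reflectors do not peer with each other; the function names are illustrative.

```python
def full_mesh_sessions(n_vteps: int) -> int:
    """iBGP sessions needed when every VTEP peers with every other VTEP."""
    return n_vteps * (n_vteps - 1) // 2


def route_reflector_sessions(n_vteps: int, n_rrs: int) -> int:
    """iBGP sessions needed when every VTEP peers only with each RR."""
    return n_vteps * n_rrs


# This lab: 4 leaves and 2 spines acting as route reflectors.
print(full_mesh_sessions(4), route_reflector_sessions(4, 2))      # 6 8

# A larger fabric: 100 leaves and the same 2 route reflectors.
print(full_mesh_sessions(100), route_reflector_sessions(100, 2))  # 4950 200
```

At lab scale the route-reflector design actually needs slightly more sessions than a full mesh (8 versus 6). The benefit shows up as the fabric grows, because the full mesh grows quadratically while the route-reflector design grows linearly with the number of VTEPs.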

 

Figure 2: BGP EVPN

The iBGP EVPN sessions use loopbacks as source interfaces, which means that you need an IGP (OSPF, IS-IS) to advertise the VTEP loopback interfaces to each other. In our lab we will use OSPF.

Multicast still needs to be enabled in the underlay infrastructure to carry BUM traffic. We used PIM sparse mode and configured the address of our Anycast RP statically on each VTEP.

VxLAN scenario:

OK, let’s assume that we are tasked with assigning two VLANs to the HR department. Those VLANs need to communicate with each other but not with the rest of the network.

In a classical network that’s easy, right? You just need to create those two VLANs and put them inside a VRF, and this is what we will do in our VxLAN scenario. We will create two VLANs that we will call GREY and GREEN, give each of them an IP address and put them together in a VRF that we will call “HR”.

 

Table 1: VxLAN and VNI association

Note: The L3 VNI VLAN is responsible for inter-VxLAN routing.

  1. As a first step, you need to enable the VxLAN and BGP features in NX-OS:

Configuration:

feature bgp
! enables BGP, which will carry the L2VPN EVPN address-family
feature vn-segment-vlan-based
! this feature allows you to map a VNI to a VLAN
feature nv overlay
! this is the VxLAN feature

Other features need to be enabled for your underlay infrastructure, such as:

feature ospf
feature pim
feature interface-vlan

  2. In the second step, we need to create our GREEN and GREY VLANs, associate a VNI with each of them as described in the table above, and configure the EVPN parameters:

Configuration:

vlan 10
  name L3_VNI
  vn-segment 10000
vlan 20
  name GREY
  vn-segment 20000
vlan 30
  name GREEN
  vn-segment 30000

! the evpn configuration permits the exchange of L2 reachability between VTEPs
evpn
  vni 20000 l2
    rd 20:20000
    route-target both 20:20000
  vni 30000 l2
    rd 30:30000
    route-target both 30:30000
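The configuration above follows a simple naming convention: each L2 VNI uses VLAN:VNI as both its RD and its route target. The Python sketch below regenerates the same stanzas from a small table of segments, which can help when the list of VLANs grows. The helper names are hypothetical, and the output is plain text that you would still review before pasting it into a switch.

```python
# (vlan_id, name, vni) for the two L2 segments used in this lab
L2_SEGMENTS = [(20, "GREY", 20000), (30, "GREEN", 30000)]


def render_vlans(segments):
    """Render the VLAN-to-VNI mapping stanzas."""
    lines = []
    for vlan_id, name, vni in segments:
        lines += [f"vlan {vlan_id}", f"  name {name}", f"  vn-segment {vni}"]
    return "\n".join(lines)


def render_evpn(segments):
    """Render the evpn stanza, using VLAN:VNI for both RD and route target."""
    lines = ["evpn"]
    for vlan_id, _name, vni in segments:
        tag = f"{vlan_id}:{vni}"
        lines += [
            f"  vni {vni} l2",
            f"    rd {tag}",
            f"    route-target both {tag}",
        ]
    return "\n".join(lines)


print(render_vlans(L2_SEGMENTS))
print(render_evpn(L2_SEGMENTS))
```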

  3. In step three, we will create our HR VRF and associate it with the L3 VNI:

Configuration:

vrf context HR
  vni 10000
  rd 10:10000
  address-family ipv4 unicast
    route-target import 10:10000
    route-target import 10:10000 evpn
    route-target export 10:10000
    route-target export 10:10000 evpn

 

  4. In the fourth step, we will configure our distributed gateway virtual MAC address, create our VLAN interfaces and activate the distributed gateway on these VLANs as shown below:

Configuration:

fabric forwarding anycast-gateway-mac 0001.0001.0001

interface vlan 20
  description GREY
  vrf member HR
  ip address 192.168.20.1/24
  no shutdown
  fabric forwarding mode anycast-gateway

interface vlan 30
  description GREEN
  vrf member HR
  ip address 192.168.30.1/24
  no shutdown
  fabric forwarding mode anycast-gateway

interface vlan 10
  vrf member HR
  no shutdown
  ! do not forget to activate your L3 VNI VLAN

 

  5. In step five, we will configure our VxLAN tunnel interface (NVE) and associate the L2 VNIs and the L3 VNI with it:

Configuration:

interface nve1
  no shutdown
  source-interface loopback0
  host-reachability protocol bgp
  member vni 10000 associate-vrf
  member vni 20000
    mcast-group 239.1.1.1
    ! associates a multicast group with each VNI for BUM traffic
    suppress-arp
    ! suppress-arp lets the VTEP cache the host reachability information learned from remote VTEPs and then behave like a proxy ARP when it receives an ARP request from an end host and the answer is already in its cache
  member vni 30000
    mcast-group 239.1.1.2
    suppress-arp
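The ARP suppression behavior described in the comment above can be pictured with a toy Python model. This is a simplified sketch of the decision logic only, not the NX-OS implementation; the class name and the MAC address are made up for illustration, while 192.168.20.2 is one of the hosts used later in this lab.

```python
class ArpSuppressionCache:
    """Toy model of the cache a VTEP consults before flooding an ARP request."""

    def __init__(self):
        self._ip_to_mac = {}

    def learn(self, ip, mac):
        """Record a host binding, for example from a received EVPN MAC-IP route."""
        self._ip_to_mac[ip] = mac

    def handle_arp_request(self, target_ip):
        """Answer locally if the binding is cached, otherwise flood as BUM traffic."""
        mac = self._ip_to_mac.get(target_ip)
        if mac is not None:
            return ("proxy-reply", mac)
        return ("flood", None)


cache = ArpSuppressionCache()
cache.learn("192.168.20.2", "0000.aaaa.0002")  # hypothetical MAC address

print(cache.handle_arp_request("192.168.20.2"))   # ('proxy-reply', '0000.aaaa.0002')
print(cache.handle_arp_request("192.168.20.99"))  # ('flood', None)
```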

 

  6. The last step is to enable BGP and establish the iBGP EVPN sessions to the RRs:

Configuration:

VTEPs configuration:

router bgp 65001
  address-family ipv4 unicast
  address-family l2vpn evpn
  neighbor 192.168.1.1
    remote-as 65001
    update-source loopback0
    address-family ipv4 unicast
    address-family l2vpn evpn
      send-community extended
  neighbor 192.168.1.2
    remote-as 65001
    update-source loopback0
    address-family ipv4 unicast
    address-family l2vpn evpn
      send-community extended

=============================================================

RR configuration:

router bgp 65001
  address-family ipv4 unicast
  address-family l2vpn evpn
    retain route-target all
  template peer IBGP-EVPN
    remote-as 65001
    update-source loopback0
    address-family ipv4 unicast
      send-community extended
      route-reflector-client
    address-family l2vpn evpn
      send-community extended
      route-reflector-client
  neighbor 192.168.1.3
    inherit peer IBGP-EVPN
  neighbor 192.168.1.4
    inherit peer IBGP-EVPN
  neighbor 192.168.1.5
    inherit peer IBGP-EVPN
  neighbor 192.168.1.6
    inherit peer IBGP-EVPN

 

To make vPC work with VxLAN, you have to assign a secondary IP address to the loopback interfaces. This address has to be the same on the two switches that form the vPC pair:

Configuration:

LEAF-3
interface loopback0
  ip address 192.168.1.34/32 secondary

LEAF-4
interface loopback0
  ip address 192.168.1.34/32 secondary

LEAF-5
interface loopback0
  ip address 192.168.1.56/32 secondary

LEAF-6
interface loopback0
  ip address 192.168.1.56/32 secondary

 

 

To perform tests, I connected two physical routers to the vPC switches using an EtherChannel interface, as shown in the diagram below:

Figure 3: VxLAN infrastructure

Within each physical router, I created two VRFs, one for VLAN 20 and the other for VLAN 30, to make sure that the inter-VLAN communication passes through the VTEPs.

Let’s ping from router 1 to router 2 within VLAN 20, within VLAN 30 and inter-VxLAN:

 

Figure 4: Ping VLAN 20

Figure 5: Ping VLAN 30

Figure 6: Ping Inter-VxLAN

 

The pings are successful!

Let’s now go back and check how reachability information is distributed from the VTEPs’ point of view.

Typing the “show l2route evpn mac-ip all” command on VTEP-1, we can clearly see:

Figure 7: L2route evpn mac-ip

 

Here you can clearly see that the host addresses 192.168.20.2 in VLAN 20 and 192.168.30.2 in VLAN 30 are received from BGP, and that the next-hop address is the secondary loopback IP address used for the vPC between VTEP-3 and VTEP-4.

The addresses 192.168.20.3 and 192.168.30.3 are learned locally by means of ARP and are saved in the HMM (Host Mobility Manager) database before being sent to the other VTEPs.

The “show bgp l2vpn evpn” command shows that BGP EVPN learns MAC and MAC-IP information as separate entries, and you can see the RD as well as the next hop associated with each entry.

Figure 8: BGP L2VPN EVPN / LEAF-1

Here is the same command typed on the RR (SPINE) router:

Figure 9: BGP L2VPN EVPN / RR

 

The VxLAN fabric can communicate with external networks using static routes or a dynamic routing protocol.

In this example I will use BGP. Remember that if you have only one exit point, a static route is more than enough.

Figure 10: VxLAN to External Networks

Here is the BGP configuration applied on VTEP-1 and VTEP-2:

Configuration:

VTEP-1
router bgp 65001
  vrf HR
    address-family ipv4 unicast
      advertise l2vpn evpn
    neighbor 10.1.1.6
      remote-as 65000
      address-family ipv4 unicast

VTEP-2
router bgp 65001
  vrf HR
    address-family ipv4 unicast
      advertise l2vpn evpn
    neighbor 10.1.1.6
      remote-as 65000
      address-family ipv4 unicast

 

As you can see, the BGP configuration sits inside the HR VRF, and the command “advertise l2vpn evpn” is responsible for redistributing the routes learned from the external peer into VxLAN.

The BGP configuration of the external router is straightforward:

External router configuration:

router bgp 65000
 bgp log-neighbor-changes
 neighbor 10.1.1.1 remote-as 65001
 neighbor 10.1.1.5 remote-as 65001
 !
 address-family ipv4
  network 172.16.1.0 mask 255.255.255.0
  neighbor 10.1.1.1 activate
  neighbor 10.1.1.5 activate
 exit-address-family

 

The external BGP router will receive all the host routes from the VxLAN network. For scalability reasons, consider filtering so that only subnets or aggregate networks are advertised to the outside.
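The effect of such a filter can be sketched in Python with the standard ipaddress module. This only models the policy (drop /32 host routes, keep subnets); on the switch itself you would implement it with the platform's own route filtering tools. The function name and the sample prefix list are illustrative, built from the subnets and host addresses used in this lab.

```python
import ipaddress


def filter_host_routes(prefixes):
    """Drop host routes (/32 for IPv4) and keep subnet or aggregate prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in networks if n.prefixlen < n.max_prefixlen]


advertised = [
    "192.168.20.0/24",
    "192.168.20.2/32",
    "192.168.20.3/32",
    "192.168.30.0/24",
    "192.168.30.2/32",
    "192.168.30.3/32",
]

print(filter_host_routes(advertised))  # ['192.168.20.0/24', '192.168.30.0/24']
```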

 

Figure 11: VxLAN networks from the outside point of view

The network 172.16.1.0 is installed on VTEP-3, and you can see that it comes from AS 65000 with RD 10:10000, which is the value that was given to the L3 VNI.

Figure 12: External network inside EVPN

 

Let’s ping from router 1 to 172.16.1.1:

Figure 13: Ping from EVPN to external networks

Great!!!

VxLAN is a suitable protocol for scalable DC architectures, but it is complicated to configure, especially for network engineers who have no experience with BGP.

Cisco provides a controller called Nexus Fabric Manager that makes VxLAN implementation and administration easier.

http://www.cisco.com/c/en/us/products/collateral/cloud-systems-management/nexus-fabric-manager/solution-overview-c22-736688.html

 

 
  • altalavista

    Great write-up Orhan, very timely. A lot of folks are looking at this as alternative to Fabricpath.

    I have one small Q:

    What’s the config like on the trunk port from LEAF to the routers? Are you tagging the L3 VNI VLAN 10? i.e. is spanning-tree up for VLAN10 towards the router?

    • Orhan Ergun

      @Altalavista, it is nice really but I did’nt write it. Author is Driss and as the other designers He is writing also here. I believe he will answer your question

    • Driss Jabbar

      Hi altalavista,

      Effectively, the link from the Leaf to the router is a trunk link with allowed vlan,20 and 30. but you have to make sure that the int vlan 10 is UP, in our case the vlan pass through the Peer-link.

      Cheers

  • sameer

    how abt Network Design / Architecture in ACI Mode ? along with DCI and Service Service Channing

  • Hispren

    Hello: I am not sure why multicast is still needed. The overall idea behind MPBGP EVPN is that some networks do not want to enable multicast and also it provides the control plane part.