Total 13 Blogs

Created by - Orhan Ergun

Why Core or Backbone is used in Networking?

Before we start answering this question, let's note that these two terms are used interchangeably. Usually, Service Providers say Backbone and Enterprise networks say Core, but they are the same thing.

Why is a Network Core necessary? The key characteristics of the Core/Backbone part of a network are:

- High-speed connectivity. Today this means hundreds of gigabits, and links are usually bundled to increase capacity.
- It brings the Internet Gateway, Access, Aggregation, and Datacenter networks together; it connects many different parts of the network and glues them together.
- Redundancy and high availability are critical. Redundant physical circuits and devices are very common, because the failure impact in this module is much higher than in other modules.
- Full mesh or partial mesh deployment is seen most often, as these types of topologies provide the greatest amount of redundancy and a direct path between different locations.
- It is commonly known in the operator community as the Backbone or 'P' layer.

Redundancy in this module is very important. Most Core network deployments in ISP networks are based on a full mesh or partial mesh. The reason for full mesh physical connectivity in the Core is that it provides the most optimal traffic flow and the shortest path between any two locations. But not every network can have a full mesh architecture, because it is the most expensive design option. Instead, many operators connect their Core/Backbone locations in a partial mesh model: not all core locations are connected to each other; only the Core POP locations with high traffic demand between them are directly connected.

The Core/Backbone also provides scalability to Service Provider networks. Without this layer, many Aggregation networks would have to be connected directly to each other to provide end-to-end connectivity. This would be too costly, and far too many physical links would need to be provisioned. The Core layer reduces the number of circuits required between different Aggregation networks. If cost is a concern, the network is small, and scalability is not a critical consideration, then the network can be designed by collapsing the Aggregation and Core layers into one.
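To see why a full mesh gets expensive quickly, the circuit counts can be worked out with a quick sketch (the POP names and link pairs here are made up purely for illustration):

```python
def full_mesh_links(n):
    """Point-to-point circuits needed to connect n core POPs in a full mesh: n*(n-1)/2."""
    return n * (n - 1) // 2

def partial_mesh_links(pairs):
    """In a partial mesh, only the listed high-demand POP pairs get a circuit."""
    return len(set(frozenset(p) for p in pairs))  # de-duplicate A-B vs B-A

# A 10-POP backbone needs 45 circuits for a full mesh...
print(full_mesh_links(10))  # 45
# ...while a partial mesh connecting only the busy pairs needs far fewer:
print(partial_mesh_links([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]))  # 4
```

The quadratic growth of the full-mesh count is exactly why only the highest-demand POP pairs are usually connected directly.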

Published - Wed, 25 May 2022

Created by - Orhan Ergun

BGP Next Hop Self in IP and MPLS Networks

BGP next-hop-self behavior - One of my CCIE Enterprise students asked a question about the BGP next hop in MPLS VPN networks. So, I am pleased to explain BGP next-hop-self behavior in both IP and MPLS networks in this post. I explain this topic in deep detail in my “BGP Zero to Hero” course. Click here for our Special Offer.

BGP Next Hop Self in IP Networks

Let's start with the IP network shown below (Figure-1) to explain the BGP next-hop-self process.

Figure-1 IBGP Next-Hop-Self handling in IP networks

In Figure-1, there is no MPLS service in the network. R1 and R2 are running IBGP with R3, and R3 is running EBGP with an upstream provider. When R3 sends a BGP prefix to R1 and R2, the BGP next hop is unchanged: the link between R3 and the Internet is set as the BGP next hop. In other words, if you examine the BGP table of R1 and R2, the next hop of the BGP prefixes coming from the Internet is the R3-Internet link.

Routers need to resolve the BGP next hop through the IGP (OSPF, IS-IS, EIGRP) in order to send packets toward the destination, but the link between R3 and the Internet (the external link) is not known by the IGP. That link can be redistributed into the IGP, or it can be set as an IGP passive interface. If you don't want to see external routes in your IGP, then the BGP next hop can be set to the router's loopback, an internal route.

To set the BGP next hop to the router's loopback, you can create a route map on R3 that sets the next hop to its loopback interface, or you can configure next-hop-self and build the IBGP sessions between the routers' loopbacks. The BGP session source interfaces, in this case, are R1, R2, and R3's loopbacks.

As you can see, if there is no MPLS VPN service, prefixes received from EBGP are advertised to IBGP neighbors without changing the next hop. If the external link is not wanted in the IGP, manual configuration is required on the edge router to change the next hop.

It is important to know that if the external link is not set as the next hop, then when that link fails, traffic is blackholed (dropped at that router) until the BGP control plane converges. BGP PIC Edge solves this problem by pre-installing an alternate route in the forwarding table.

BGP Next Hop Self in MPLS Networks

Let's take a look at an MPLS VPN network and see how the BGP next-hop-self operation is done.

Figure-2 MPLS Network

Figure-2 shows the MPLS components, so let's examine the MPLS Layer 3 VPN service. MPLS Layer 3 VPN requires the PE router to be a Layer 3 neighbor of the CE routers; this can be a static route, RIP, EIGRP, OSPF, IS-IS, or BGP. IP prefixes are received from the CE routers and the PE appends an RD (Route Distinguisher) to them, creating completely new VPN prefixes (IPv4 + RD = VPNv4). PE routers re-originate all customer prefixes regardless of their origin (static redistribution or PE-CE OSPF/IS-IS/EIGRP/BGP) and advertise them to all MP-IBGP peers, setting the BGP next hop to themselves.

Unlike in the IP network, you don't need any manual operation. The MP-BGP neighborship between the PE routers should be created between their loopbacks, and in that case the loopback is set as the next hop automatically, without configuring next-hop-self.

BGP next-hop-self is therefore already an automated process in MPLS VPN, and you don't want to advertise the external (PE-CE) interfaces into the IGP for scalability and stability reasons. Scalability would suffer because many customer interfaces would be advertised into the IGP, and the IGP wouldn't scale. Stability would suffer because whenever an interface flaps, it would cause the SPF or DUAL algorithm to run.

You may ask how the SP can monitor those interfaces in MPLS VPN. Those interfaces are placed in a Network Management VRF and carried to the Network Management System through MP-BGP.

BGP Next Hop Self in IP and MPLS networks is explained in this short blog post; you can find dozens of BGP-related blog posts on the website!
BGP next-hop-self works the same way on Cisco, Juniper, Nokia, and many other major vendors' devices!
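The next-hop behavior described in this post can be condensed into a small, vendor-neutral sketch (the function name and addresses are hypothetical, not any vendor's API):

```python
def ibgp_advertised_next_hop(received_next_hop, router_loopback,
                             next_hop_self=False, mpls_vpn=False):
    """Return the next hop an edge router advertises to its IBGP peers.

    Plain IP + IBGP: the EBGP-learned next hop is passed along unchanged,
    unless next-hop-self is configured manually.
    MPLS L3VPN: the PE automatically sets itself (its loopback, the MP-BGP
    session source) as the next hop for the VPNv4 routes it re-originates.
    """
    if mpls_vpn or next_hop_self:
        return router_loopback
    return received_next_hop

# R3 learns a prefix from the upstream with next hop 198.51.100.1 (external link):
print(ibgp_advertised_next_hop("198.51.100.1", "10.0.0.3"))                      # 198.51.100.1
print(ibgp_advertised_next_hop("198.51.100.1", "10.0.0.3", next_hop_self=True))  # 10.0.0.3
print(ibgp_advertised_next_hop("198.51.100.1", "10.0.0.3", mpls_vpn=True))       # 10.0.0.3
```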

Published - Mon, 11 Apr 2022

Created by - Orhan Ergun

BGP Route Reflector Clusters

BGP route reflectors, used as an alternative to full-mesh IBGP, help with scaling. A BGP route reflector cluster is used to provide redundancy in a BGP RR design. BGP route reflectors and their RR clients form a cluster (Cluster = BGP RR + BGP RR Clients). I explain this topic in deep detail in our specialized BGP Zero to Hero course.

In IBGP topologies, every BGP speaker has to be in a logical full mesh, so every BGP router has to have a direct IBGP neighborship with every other one. The route reflector is an exception: if you place a BGP route reflector, the IBGP routers set up BGP neighborships with only the route reflectors. In this article, I will specifically cover route reflector clusters and their design. For those who want to understand BGP route reflectors first, I highly recommend my BGP Route Reflector in Plain English post.

What is the BGP Route Reflector Cluster-ID?

The Route Reflector Cluster ID is a four-byte BGP attribute, and, by default, it is taken from the route reflector's BGP router ID. If two routers share the same BGP cluster ID, they belong to the same cluster. Before reflecting a route, a route reflector appends its cluster ID to the cluster list. If the route originated from the route reflector itself, the route reflector does not create a cluster list. If the route is sent to an EBGP peer, the RR removes the cluster list. If the route is received from an EBGP peer, the RR does not create a cluster list attribute.

Why is the BGP Route Reflector cluster list used?

The RR cluster list is used for loop prevention, and only by the route reflectors. Route reflector clients do not use the cluster list attribute, so they do not know which cluster they belong to.

If there are two route reflectors, are the same or different cluster IDs better on them? If an RR receives a route from an IBGP neighbor and the route already carries the RR's own cluster ID, the route is discarded. Let's start with a basic topology.

BGP Route Reflector Cluster - Same Cluster ID

Figure-1 Route reflectors use the same cluster-ID

In the diagram shown above in Figure-1, R1 and R2 are the route reflectors, and R3 and R4 are the RR clients. Both route reflectors use the same cluster ID. Green lines depict physical connections; red lines show IBGP sessions. Assume both route reflectors use cluster ID 1.1.1.1, which is R1's router ID.

R1 and R2 receive routes from R4, and R1 and R2 receive routes from R3. Both R1 and R2, as route reflectors, append 1.1.1.1 as the cluster ID attribute on the routes they send to each other. However, since they use the same cluster ID, they discard each other's routes. That's why, if RRs use the same cluster ID, RR clients have to connect to both RRs.

In this topology, routes behind R4 are learned by R1 only from the direct R1-R4 IBGP session (R1 rejects them from R2). Of course, the IGP path goes through R1-R2-R4, since there is no physical path between R1 and R4. If the physical link between R2 and R4 goes down, both IBGP sessions (R1-R4 and R2-R4) go down as well, and the networks behind R4 cannot be learned. Since the routes cannot be learned from R2 (same cluster ID), if the physical link stays up but the IBGP session between R1 and R4 goes down, the networks behind R4 will not be reachable either. However, if the BGP neighborships are between loopbacks and the physical topology is redundant, the chance of an IBGP session going down is very low.

Note: Having redundant physical links is common network design best practice. That's why the topology below is a more realistic one. What if we add physical links between R1-R4 and R2-R3?

BGP Route Reflector Clusters - Same Cluster-ID with extra redundancy

Figure-2 Route reflectors use the same cluster-ID; physical cross-connections are added between the RRs and RR clients

In Figure-2, physical cross-connections are added between R1-R4 and R2-R3. We are still using the same BGP cluster ID on the route reflectors. Thus, when R2 reflects R4's routes to R1, R1 will discard them. In addition, R1 will learn R4's routes through its direct IBGP peering with R4. In this case, the IGP path changes to R1-R4 rather than R1-R2-R4. If the R1-R4 physical link fails, the IBGP session will not go down as long as the IGP converges to the R1-R2-R4 path quicker than the BGP session timeout (by default, it does).

Thus, having the same cluster ID on the RRs saves a lot of memory and CPU resources on the route reflectors, and link failures do not cause IBGP session drops if there is enough redundancy in the network. If we used different BGP cluster IDs on R1 and R2, R1 would accept the reflected routes from R2 in addition to the routes from its direct peering with R4.

Conclusion: Orhan Ergun recommends the same BGP cluster ID for route reflector redundancy if there is a resource constraint on the route reflectors; otherwise, the route reflectors would keep an extra copy of each prefix that wouldn't be advertised to the route reflector clients anyway. If there is no resource problem, having different cluster IDs provides faster convergence in some cases (depending on the topology).

To get a deeper understanding of SP networks, you can check our Service Provider Networks Design and Architecture Perspective book.
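The same-cluster-ID discard behavior can be sketched as a toy model of RFC 4456 cluster-list handling (the route structure and addresses here are invented for illustration):

```python
def reflect(route, my_cluster_id):
    """Before reflecting, the RR prepends its own cluster ID to the CLUSTER_LIST."""
    return {**route, "cluster_list": [my_cluster_id] + route.get("cluster_list", [])}

def accept_reflected_route(my_cluster_id, cluster_list):
    """An RR discards any route whose CLUSTER_LIST already contains its own
    cluster ID - this is the loop-prevention check discussed above."""
    return my_cluster_id not in cluster_list

# R4 (a client) originates a route; R2 reflects it. Both RRs share cluster ID 1.1.1.1.
r4_route = {"prefix": "10.4.0.0/16", "cluster_list": []}
reflected_by_r2 = reflect(r4_route, "1.1.1.1")

print(accept_reflected_route("1.1.1.1", reflected_by_r2["cluster_list"]))  # False: R1 discards it
print(accept_reflected_route("2.2.2.2", reflected_by_r2["cluster_list"]))  # True: a different cluster ID would accept it
```

This is exactly why, with a shared cluster ID, every client must peer with both RRs: the inter-RR reflection is always discarded.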

Published - Sun, 10 Apr 2022

Created by - Orhan Ergun

Four necessary steps in routing fast convergence

When it comes to fast convergence, the first thing we need to understand is: what is convergence? Convergence is the time between a failure and the recovery. Links, circuits, routers, and switches all eventually fail. As network designers, our job is to understand the topology and, wherever there is a requirement, add a backup link or node. Of course, not every network, and not every place in the network, requires redundancy. But let's assume we want redundancy: we add a backup link or node and we want to recover from the failure as quickly as possible, hopefully before the application times out.

But at what point can we say a network is converging fast? Unfortunately, there is no numerical value for it. You cannot say that 30 seconds, 10 seconds, or 1 second is fast convergence; your application's convergence requirement might be well below 1 second. Thus, I generally call 'fast convergence' any convergence time faster than the default convergence value. Let's say OSPF on broadcast media converges in 50 seconds; then any attempt to make OSPF converge faster than that 50-second default is OSPF fast convergence on broadcast media.

There are in general four steps to making convergence faster, so four steps for fast convergence.

Four necessary steps in fast convergence

1. Failure detection

Layer 1 failure detection mechanisms: carrier delay, debounce timer, SONET/SDH APS timers.

Layer 3 failure detection mechanisms: protocol timers (Hello/Dead), BFD (Bidirectional Forwarding Detection).

For failure detection, the best practice is to always use a physical-down detection mechanism first. Even BFD cannot detect a failure faster than a physical failure detection mechanism, because BFD is a poll-based mechanism whose messages are sent and received periodically, while physical layer detection is event driven and always faster than BFD and protocol hellos. If physical layer detection mechanisms cannot be used (for example, because there is a transport element in the path), then BFD should be used instead of tuning protocol hello timers aggressively. A common example: if two routers are connected through an Ethernet switch, the best method is to use BFD. Compared to protocol hellos, BFD messages are much lighter, so BFD consumes fewer resources and less bandwidth.

2. Failure propagation

Propagation of the failure information throughout the network. Here LSA/LSP throttling timers come into play: you can tune LSA throttling for faster information propagation (it can be used to slow down information processing as well). LSP pacing timers can also be tuned to send updates much faster.

3. New information processing

Processing of the newly arrived LSA/LSP to find the next best path. SPF throttling timers can be tuned for faster processing and fast convergence.

4. Updating the new route into the RIB/FIB

For fast convergence, these steps may need to be tuned. Although the RIB/FIB update is hardware dependent, the network operator can configure all the other steps.

One thing always needs to be kept in mind: fast convergence and fast reroute can affect network stability. In both OSPF and IS-IS, an exponential backoff mechanism is used to protect the routing domain from rapid flapping events. It slows down convergence by penalizing unstable events, a mechanism very similar to IP event and BGP dampening.
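As a rough illustration of the exponential backoff idea above, here is a toy SPF throttling model. The timer values mimic typical initial/increment/max knobs but are arbitrary, and the reset-after-quiet-period rule is a simplification of real implementations:

```python
def spf_schedule(event_times, initial=0.05, increment=0.2, maximum=5.0, quiet=10.0):
    """Toy SPF throttle: the first SPF after a quiet period runs `initial`
    seconds after the trigger; each subsequent trigger waits twice as long
    (initial -> increment -> 2*increment -> ...) up to `maximum`.
    Returns the time each SPF run actually starts."""
    runs, wait, last = [], initial, None
    for t in event_times:
        if last is not None and t - last > quiet:
            wait = initial                      # network stable again: reset the backoff
        runs.append(round(t + wait, 3))
        last = t
        wait = increment if wait == initial else min(wait * 2, maximum)
    return runs

# A burst of flaps at t=0,1,2,3s is progressively penalized:
print(spf_schedule([0, 1, 2, 3]))   # [0.05, 1.2, 2.4, 3.8]
# After a quiet period the penalty resets:
print(spf_schedule([0, 20]))        # [0.05, 20.05]
```

The trade-off stated above is visible here: a stable network converges in tens of milliseconds, while a flapping one is deliberately slowed down.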

Published - Fri, 07 Aug 2020

Created by - Orhan Ergun

Fast Convergence and Network Stability Considerations in Service Provider Network

Service Provider network design and deployment is one of the most mysterious parts of networking; we don't usually see real-life SP network design and deployment discussions. Fast convergence and network stability are required in Service Provider networks, and they are especially important in the Core/Backbone, compared to other places such as the Aggregation or Access networks. I discussed these topics with Mohamed Radwan, along with many other important considerations in Service Provider networks. At the end of the post, you will see the recording of our discussion. If you find this video useful, let me know in the comment section!

Below are some of the discussion topics during the session:

1. BFD and Layer 1 Failure Detection Mechanisms
2. Prefix Prioritization
3. IGP - BGP Sync
4. IGP - LDP Sync
5. LDP Session Protection
6. LSP Throttling Timers
7. Link, Node, SRLG Failure Cases and Convergence Steps
8. Next Hop Tracking and BGP Scanner
9. IBGP Minimum Route Advertisement Interval
10. EBGP Fast External Fallover

https://www.youtube.com/watch?v=yG2pwOtiBo4&t=52s

Published - Fri, 07 Aug 2020

Created by - Orhan Ergun

Ask these questions before you replace any technology in your network !

If you are replacing one technology with another, these are the questions you should be asking. This may not be the complete list, and some questions may matter more than others for your network, but definitely keep them in mind, or come back to this post and check before you replace one technology with another!

Is this change really needed? Is there a valid business case?

This is the first and most important question, because we are deciding whether the change is absolutely necessary. If the technology you are migrating to won't bring any business benefit (OPEX, CAPEX, a new revenue stream, etc.), then the existing technology should stay. This is true for new software releases on routers as well: if there is no new feature you need in the new release and no known bug that affects the stability of the network, a longer software lifecycle is better than upgrading frequently.

What is the potential impact on the overall network?

A new technology might require extra resource usage in the network. Can your network devices accommodate this resource growth? The opposite is true as well: a new technology might reduce resource usage, but at what cost? In general, reducing state in the network (routing table, MAC address table, MPLS label table, etc.) creates suboptimal routing, and possibly blackholing, depending on the network topology. For example, if you replace a full-mesh IBGP design with an IBGP route reflector design, it reduces overall resource usage (less state on the routers) but can create suboptimal routing (depending on the topology).

What will the migration steps be?

In the network design lifecycle, the deployment steps are prepared by the designers. These steps are covered in the Network Migration Plan document or, if a separate migration document is not requested by the customer, they are highlighted clearly in the Low Level Design (LLD) document. If the migration steps are not executed in order, you get longer network downtime, which costs the organization money, or the migration might fail completely. Any migration document should also include a rollback plan, so that if the migration cannot be executed in the planned window, the escape/rollback plan can be started as planned.

Is there a budget constraint?

Budget is always a real concern, in almost any business. Why is budget important in technology migration? Because the new technology may not be known by the network engineers, and a learning process might be necessary. Free learning resources are good, but how much can you trust them? So, I always recommend taking training in a structured way from known network designers who follow the most recent updates in the industry, have designed many networks at every scale (not just a couple of large-scale ones), and are recommended by people you trust in the industry. (I spent time writing this, so let me do a little marketing!)

Budget is also a concern when you design a large-scale network, add a new technology to an existing (brownfield) network, design for a merger and acquisition, secure the network, and so on. When you migrate, ask yourself: do the network engineers in my company know and can they handle the new technology? Do you need to buy additional hardware to accommodate it?

I could expand this list. Let me know your thoughts in the comment box below. Did you recently migrate any technology in your network? Was this post helpful?

Published - Wed, 27 Nov 2019

Created by - Orhan Ergun

BGP Optimal Route Reflection – BGP ORR

BGP Optimal Route Reflection provides optimal routing for the route reflector clients without sending all available paths. I recommend reading this post first if you don't know about BGP route reflectors.

If you are reading this post, you probably know that a BGP route reflector, by default, chooses and advertises only the best path to its clients, from its own point of view. If the higher-level BGP attributes are the same for two paths (longest match, Local Preference, Origin, MED, etc.), the BGP route reflector chooses the best path based on IGP cost: the path with the shortest cumulative IGP distance is selected and advertised to the RR clients. When the RR advertises the best path to the clients in this way, it can lead to suboptimal routing for the clients.

Let's briefly define suboptimal routing as well. Suboptimality, in almost all books you will read, is considered in terms of cumulative IGP cost from source to destination. It is not based on delay, monetary cost, fiber-mile distance, reliability of the path, etc., just on the shortest total/cumulative cost.

Optimal Route Reflection is an IETF draft, but there are many vendor implementations as of 2019. With this solution, the RR does the optimal path selection from each client's point of view: it runs the SPF calculation with the client as the root of the tree and calculates the cost to the BGP next hop based on that view. With ORR, the route reflector's location becomes independent of the best-path selection process. Each ingress BGP border router can have a different exit point to the transit providers for the same prefix.

What does BGP ORR (Optimal Route Reflection) achieve?

Let's have a look at the topology below to understand what BGP Optimal Route Reflection achieves. In the picture, the blue, green, and orange links are physical connections and the dashed black lines are IBGP sessions. The numbers represent the IGP cost from each device to the BGP next hops.

The same BGP prefix is received from IGW1, IGW2, and IGW3 and advertised to the RR. As you can see, the RR chooses IGW3 due to the shortest IGP cost and advertises it to all the PE devices, which are RR clients. But IGW3 is the shortest path only for PE3; it is not the best path for PE1 and PE2. With BGP Optimal Route Reflection, the RR advertises IGW1 to PE1, IGW2 to PE2, and IGW3 to PE3 as the best path, not IGW3 to all the PE devices. By sending only one path, the RR still provides optimal routing (remember how we defined optimal routing above?).

Without BGP Optimal Route Reflection, the RR would need to advertise all three IGWs to all PE devices so that the PE devices could calculate their own shortest path to the BGP destinations. Obviously, this can create a resource issue on the PE devices, but it would be the only other way to provide optimal routing to the clients. With BGP ORR, you can think of the processing requirement as moved from the RR clients to the RR itself: it not only holds all the necessary prefixes with multiple next hops but also needs to run SPF from each client's point of view to calculate and advertise the shortest path per client.

BGP ORR requirements: A link-state routing protocol is required in the network so that the route reflectors have a complete view of the network topology from the IGP perspective. No changes are required on the clients. ORR is applicable only when the BGP path selection comes down to the IGP metric to the BGP next hop, so the chosen path has the lowest metric for getting the Internet traffic out of the network as soon as possible.

How BGP ORR works: As a first step, the topology data is acquired via IS-IS, OSPF, or BGP-LS. The route reflector then has the entire IGP topology, so it can run its own SPF computations with each client as the root. There can be as many rSPF (Reverse SPF) runs as there are RR clients, which can increase the CPU load on the RR. So, a separate RIB is kept on the RR for each client (or group of clients). BGP NLRI and next-hop changes trigger ORR SPF calculations: for each next-hop change, an SPF calculation is triggered on the route reflector.

The route reflectors should have a complete IGP view of the network topology for ORR, so a link-state routing protocol is required in the network. OSPF or IS-IS can be used to build the IGP topology information. An IGP is great for link-state distribution within a routing domain or autonomous system, but for link-state distribution across routing domains, an EGP is required. BGP-LS provides this capability at high scale by carrying the link-state information from the IGP protocols in BGP protocol messages.

The route reflector keeps track of which route it has sent to each client, so it can resend a new route based on changes in the network topology (BGP/IGP reachability changes). The classic route reflector function is one process per route, but the ORR function is one process per route per client router.

ORR brings the flexibility to place the route reflector anywhere in the topology, provides hot-potato routing, supports resiliency via ORR groups, requires no support from the clients, and gives even better results when used with BGP ADD-PATH.

If you liked this post and would like to see more, please let me know in the comment section below. Share your thoughts so I can continue to write similar ones.
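The per-client selection difference can be shown with a toy model (the IGP costs and device names below are invented for illustration, in the spirit of the figure; real ORR derives these costs from rSPF runs over the IGP topology):

```python
def classic_rr_best_path(igp_cost_from_rr, candidates):
    """Without ORR: the RR picks the exit with the lowest IGP cost from
    *its own* location and reflects that single path to every client."""
    return min(candidates, key=lambda exit_: igp_cost_from_rr[exit_])

def orr_best_path(igp_cost, client, candidates):
    """With ORR: the RR roots the SPF at the *client* and picks the exit
    with the lowest IGP cost from that client's point of view."""
    return min(candidates, key=lambda exit_: igp_cost[client][exit_])

# Hypothetical IGP cost to each IGW next hop, per vantage point:
cost = {
    "RR":  {"IGW1": 30, "IGW2": 20, "IGW3": 10},
    "PE1": {"IGW1": 5,  "IGW2": 25, "IGW3": 40},
    "PE2": {"IGW1": 25, "IGW2": 5,  "IGW3": 40},
    "PE3": {"IGW1": 40, "IGW2": 25, "IGW3": 5},
}
igws = ["IGW1", "IGW2", "IGW3"]

print(classic_rr_best_path(cost["RR"], igws))                              # IGW3, for every client
print([orr_best_path(cost, pe, igws) for pe in ("PE1", "PE2", "PE3")])     # ['IGW1', 'IGW2', 'IGW3']
```

With these costs, the classic RR gives PE1 and PE2 a suboptimal exit, while ORR hands each PE its own hot-potato exit.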

Published - Wed, 27 Nov 2019

Created by - Orhan Ergun

Route Redistribution Best Practices

Route Redistribution Best Practices - You need route redistribution for many reasons. In this post, the drivers for route redistribution and, more importantly, the best practices for applying it will be explained in detail. I explain this topic in deep detail in my “BGP Zero to Hero” course. Click here for our Special Offer.

Redistribution allows one domain's information to be leaked into another. This can be between companies, within the same company but between different routing protocols, or, in the IS-IS case, within the same routing protocol but between different levels. Redistribution comes with a cost. In this article, I will explain the drivers for route redistribution, how it really works, and best practices from a network design point of view.

You may have a partner network that uses a different routing protocol than yours, although it doesn't have to be different. The better practice is to create a BGP neighborship between the two companies and provide reachability to each other that way. By doing this, you continue to use a separate IGP on each side and use BGP for the partner network routes. The biggest advantage of this approach is that link flaps, route flaps, and intentional or unintentional operator mistakes in the partner network will not cause routing protocol convergence in your network.

Best practice: If you have multiple points where you redistribute routes, be aware of routing loops. Use route maps, distribute lists, route tags, or some other filtering mechanism.

Best practice: If you are running a link-state protocol as your IGP and you have a broadcast segment, it is better to make the Designated Router and the redistribution point (ASBR) the same router. In this way, you reduce the amount of flooding.

You may want to redistribute a default route into your IGP at the Internet edge. You need to be really careful while doing this: use filtering mechanisms to allow only the default route.

If a company is using IS-IS as its IGP and multiple IS-IS levels are deployed, you may need to leak some routes from the Level 2 domain into the Level 1 domain. This is done by redistributing selected addresses from Level 2 into Level 1. Why would you want to leak prefixes into Level 1? You have to leak the loopback addresses of all PE devices from Level 2 into Level 1 if MPLS is enabled in the network (Seamless/Unified MPLS is the exception; in that case you learn the remote PE addresses and label bindings through BGP).

An enterprise might be receiving an MPLS VPN service from a Service Provider and using a different PE-CE routing protocol than its IGP. In this case, redistribution is necessary, and all the best practices above apply. VPN service providers need to redistribute the customer's routing protocol into Multiprotocol BGP if the customer is not using BGP as the PE-CE routing protocol. In this case, the Service Provider takes the routes from the VRF, makes them VPN routes by adding RD and RT values, and sends them over the SP network.

Conclusion: Avoid redistribution if you can. It is easy to create routing loops, and managing the configuration to avoid loops might be too complex.

It's Your Turn: Do you have any bad experience with route redistribution? Share your experience and suggestions in the comment box below.
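The route-tag best practice for multi-point redistribution can be sketched as a toy model (the tag values, prefixes, and route structure are made up; in practice this is done with route maps matching and setting tags):

```python
def redistribute(routes, set_tag, deny_tags):
    """Loop prevention with route tags at a redistribution point:
    drop any route already carrying a tag from the other domain, and tag
    the routes we inject so the far redistribution point can drop them."""
    out = []
    for r in routes:
        if r.get("tag") in deny_tags:
            continue                 # originated in the other protocol: don't leak it back
        out.append({**r, "tag": set_tag})
    return out

# OSPF domain holds one native route and one that was injected from EIGRP (tag 200):
ospf_routes = [{"prefix": "10.1.0.0/16"}, {"prefix": "10.2.0.0/16", "tag": 200}]

# Redistributing OSPF -> EIGRP: tag our injections 100, deny EIGRP-origin tag 200.
into_eigrp = redistribute(ospf_routes, set_tag=100, deny_tags={200})
print([r["prefix"] for r in into_eigrp])  # ['10.1.0.0/16'] - the tagged route was filtered
```

With two such points, each denying the other's tag, a route can never loop OSPF -> EIGRP -> OSPF.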

Published - Tue, 26 Nov 2019

Created by - Orhan Ergun

Quality of Service Best Practices

QoS Best Practices - What is a best practice? Below is Wikipedia's definition, which applies here as well: "A best practice is a method or technique that has been generally accepted as superior to any alternatives because it produces results that are superior to those achieved by other means or because it has become a standard way of doing things, e.g., a standard way of complying with legal or ethical requirements."

Although in real-life designs we may not be able to follow best-practice network design due to many constraints (technical, budgetary, or political), knowing the best practices is critical for network design in real life as well as in the exams. Below are the generally accepted Quality of Service best practices. I covered these and many other technology best practices in CCDE In-Depth, my latest network design book.

Always classify and mark applications as close to their sources as possible. Classification and marking are usually done in both the ingress and egress directions, but queuing and shaping are usually done on egress. Ingress queuing can be done to prevent head-of-line blocking; otherwise, queuing is done almost always at the egress interface.

Less granular fields such as CoS and MPLS EXP should be mapped to DSCP as close to the traffic source as possible. CoS and EXP are 3 bits, so you can have a maximum of 8 classes with them; DSCP is 6 bits, allowing 64 different classes, and is therefore considered more granular. This matters when comparing MPLS Layer 3 and Layer 2 VPNs: MPLS Layer 3 VPN provides more granular QoS because it uses DSCP instead of CoS (the Class of Service bits carried at Layer 2).

Follow standards-based Diffserv PHB markings if possible to ensure interoperability with SP networks, enterprise networks, or networks being merged together. RFC 4594 provides configuration guidelines for Diffserv service classes.

If there is real-time, delay-sensitive traffic, LLQ should be enabled, because the LLQ is always served before any other queue; only when the traffic in the LLQ is finished are the other queues handled. LLQ is the combination of CBWFQ (Class-Based Weighted Fair Queuing) and priority queuing.

Enable queuing at every node that has the potential for congestion. For example, at the WAN edge, the bandwidth toward the wide area network is generally less than in the local area network or datacenter, so the WAN edge is a common place for queuing mechanisms.

Limit the LLQ to 33% of the link bandwidth. Otherwise, real-time traffic such as voice can eat up all the bandwidth and other applications suffer during congestion.

Enable admission control on the LLQ. This is very important: if you allocated bandwidth that can accommodate only 10 voice calls, an 11th call disrupts all 11 calls, not just the 11th. Admission control for real-time traffic is important.

Policing should be done as close to the source as possible, because you don't want to carry traffic that will be dropped anyway (this is a common network design suggestion I give my clients for security filters as well). This is one of the most important Quality of Service best practices.

Do not enable WRED on the LLQ. WRED is only effective on TCP-based applications, and most if not all real-time applications use UDP, not TCP.

Allocate 25% of the capacity to the best-effort class if there is a large number of applications in the default class.

Use WRED for congestion avoidance on TCP traffic; it is effective only for TCP. Use DSCP-based WRED wherever possible, as it provides a more granular implementation. On AF queues in particular, DSCP-based WRED should be enabled; otherwise, TCP global synchronization occurs. WRED drops packets randomly, and the DSCP awareness lets packets be dropped based on priority.

Always implement QoS in hardware, as opposed to software, if possible, to avoid a performance impact. In the campus environment, this means enabling classification and marking on the switches rather than the routers, since switches provide hardware-based QoS. Because the 802.1p (CoS) bits are lost when a packet enters the IP or MPLS domain, mapping to DSCP is needed.

A QoS design should support a minimum of three classes: EF (Expedited Forwarding), DF (Default Forwarding/Best Effort), and AF (Assured Forwarding). If company policy allows YouTube, gaming, and other non-business applications, a scavenger class is created and the CS1 PHB is implemented; CS1 is defined as less-than-best-effort service in the standard RFC.

Whenever possible, don't place TCP and UDP traffic in the same queue; place them in separate queues.

If the requirement is to carry end-to-end customer markings over an MPLS Service Provider network, ask for Pipe mode Diffserv tunneling from the service provider. Uniform mode changes the customer's markings, so the customer would need to re-mark at the remote site, which creates configuration complexity.

This may not be the full list, but these are definitely important and common Quality of Service best practices. If you want to discuss or add anything to this post, please share it in the comment box below.
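As a rough illustration, the 33%-LLQ and 25%-best-effort guidelines above can be checked programmatically. This is a toy policy validator, not any vendor's tooling; the class names and bandwidth figures are hypothetical:

```python
def check_qos_policy(link_bw_kbps, classes):
    """Validate a queuing policy against two of the best practices above:
    priority (LLQ) traffic capped at 33% of link bandwidth, and at least
    25% of the link left over for the default/best-effort class.
    `classes` is a list of (name, reserved_kbps, is_priority) tuples."""
    llq = sum(bw for _name, bw, prio in classes if prio)
    best_effort = link_bw_kbps - sum(bw for _name, bw, _prio in classes)
    warnings = []
    if llq > 0.33 * link_bw_kbps:
        warnings.append("LLQ exceeds 33% of link capacity")
    if best_effort < 0.25 * link_bw_kbps:
        warnings.append("less than 25% left for best effort")
    return warnings

# Hypothetical policy on a 10 Mbps WAN link:
policy = [("voice", 3000, True), ("video", 2500, False), ("critical-data", 2500, False)]
print(check_qos_policy(10000, policy))  # ['less than 25% left for best effort']
```

Here the voice class stays under the 33% priority cap, but the reservations leave only 20% for the default class, so the validator flags it.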

Published - Tue, 26 Nov 2019