I think it is time to write about this; otherwise people will lose their money for nothing. Today I got a WhatsApp message from someone who said, ‘I can’t join your Onsite CCDE training. Is there a way to buy REAL scenarios online?’
Is MPLS mandatory for Traffic Engineering?
What are VPWS, VLL and EoMPLS?
33% Discount – Limited seats!
On all CCDE Products
33% OFF on the products below!
The discount is valid for both Online Instructor Led CCDE Training and In-Class Instructor Led Training.
I receive many questions about the Dubai Onsite bootcamp; this discount is valid for it as well.
Note: There are only 3 seats left for the Onsite bootcamp. You may not be able to register, so please contact email@example.com immediately.
Fast Convergence and the Fast Reroute
Network reliability is an important design aspect for the deployability of time- and loss-sensitive applications. When a link, node or SRLG (Shared Risk Link Group) failure occurs in a routed network, there is inevitably a period of disruption to the delivery of traffic until the network reconverges on the new topology.
For some applications, reacting quickly to the failed element is essential. There are two approaches to fast reaction in case of failure: fast convergence and fast reroute. Although people use these terms interchangeably, they are not the same thing.
In this post I will explain the definitions of, and the high level design considerations for, fast convergence and fast reroute.
The fast reroute mechanisms in IP and MPLS, their design considerations and the pros and cons of each will be explained in a separate post.
When a local failure occurs, four steps are necessary for convergence. These steps are completed before traffic continues on the backup/alternate link.
1. Failure detection (protocol hello timers, carrier delay and debounce timers, BFD and so on)
2. Failure propagation (LSA and LSP throttling timers)
3. Processing the new information (backup/alternate path calculation; SPF wait and run times)
4. Updating the new route in the RIB/FIB (after this step, traffic can continue to flow through the backup link)
For fast convergence, these steps are tuned. Tuning the timers generally means lowering them, because most vendors ship with higher default timers to be on the safe side. As you will see later in this post, lowering these timers can create stability issues in the network.
When you tune the timers for failure detection, propagation and new path calculation, it is called fast convergence, because traffic can move to the alternate link faster than with regular convergence. (Instead of a 30-second hello timer you can use a 1-second hello, or instead of a 5-second SPF wait time you can use 10 ms, and so on.)
Although the RIB/FIB update is hardware dependent, the network operator can configure all the other steps.
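To make the four steps concrete, here is a minimal Python sketch that adds up the delay of each step before and after tuning. Only the hello and SPF wait values come from the example above. The propagation and RIB/FIB figures, and all the names in the code, are assumptions for illustration, and treating the hello timer as the detection time is a simplification.

```python
# The four convergence steps, in the order they happen after a failure.
STEPS = ("detection", "propagation", "calculation", "rib_fib_update")

# Per-step delays in milliseconds. Hypothetical figures unless noted.
DEFAULT_TIMERS_MS = {
    "detection": 30_000,    # 30-second hello timer from the text (simplified)
    "propagation": 500,     # assumed LSA/LSP throttling delay
    "calculation": 5_000,   # 5-second SPF wait time from the text
    "rib_fib_update": 200,  # assumed; hardware dependent, not tunable
}

# Fast convergence: the operator lowers the first three timers only.
TUNED_TIMERS_MS = {
    **DEFAULT_TIMERS_MS,
    "detection": 1_000,     # 1-second hello from the text
    "propagation": 50,      # assumed tuned throttling delay
    "calculation": 10,      # 10 ms SPF wait time from the text
}


def convergence_time_ms(timers):
    """Sum the delay of every step; traffic resumes after the last one."""
    return sum(timers[step] for step in STEPS)


if __name__ == "__main__":
    print("default:", convergence_time_ms(DEFAULT_TIMERS_MS), "ms")
    print("tuned:  ", convergence_time_ms(TUNED_TIMERS_MS), "ms")
```

With these assumed numbers the total drops from 35700 ms to 1260 ms, and the RIB/FIB update, which the operator cannot tune, becomes a visible share of what remains.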
One thing always needs to be kept in mind: fast convergence and fast reroute can affect network stability. If you configure the timers very low, you might see false positives, such as a healthy link or neighbor being declared down.
Unlike fast convergence, with fast reroute the backup path is pre-computed and pre-programmed into the router’s RIB/FIB. This increases memory utilization on the devices.
There are many fast reroute mechanisms available today. The best known are Loop Free Alternate (LFA), Remote Loop Free Alternate (rLFA), MPLS Traffic Engineering Fast Reroute and Segment Routing Fast Reroute.
Loop Free Alternate and Remote Loop Free Alternate are also known as IP or IGP Fast Reroute mechanisms. The main difference between MPLS Traffic Engineering Fast Reroute and the IP Fast Reroute mechanisms is coverage.
MPLS TE FRR can protect any traffic in any topology. IP FRR mechanisms need the physical topology of the network to be highly connected.
Ring and square topologies are hard for the IP FRR mechanisms but not a problem at all for MPLS TE FRR. In other words, finding a backup path is not always possible with IP FRR mechanisms if the physical topology is a ring or a square. From this perspective, the best physical topology is a full mesh.
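The coverage difference can be checked with a small Python sketch. It applies the basic loop-free condition from RFC 5286: a neighbor N of the source S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). The two topologies, the unit link costs and all function names are assumptions for illustration, not taken from any vendor implementation.

```python
import heapq


def build_graph(links):
    """Build a symmetric cost graph from (node_a, node_b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_distances(graph, source):
    """Dijkstra: shortest distance from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist


def loop_free_alternates(graph, source, destination):
    """Neighbors of source that are not primary next hops and are loop-free."""
    from_source = shortest_distances(graph, source)
    alternates = []
    for neighbor, link_cost in graph[source].items():
        from_neighbor = shortest_distances(graph, neighbor)
        is_primary = (link_cost + from_neighbor[destination]
                      == from_source[destination])
        is_loop_free = (from_neighbor[destination]
                        < from_neighbor[source] + from_source[destination])
        if not is_primary and is_loop_free:
            alternates.append(neighbor)
    return sorted(alternates)


SQUARE_LINKS = [("A", "B", 1), ("B", "C", 1), ("C", "D", 1), ("D", "A", 1)]
square = build_graph(SQUARE_LINKS)
full_mesh = build_graph(SQUARE_LINKS + [("A", "C", 1), ("B", "D", 1)])

if __name__ == "__main__":
    print("square:   ", loop_free_alternates(square, "A", "B"))
    print("full mesh:", loop_free_alternates(full_mesh, "A", "B"))
```

In the square, router A has no loop-free alternate towards B, because its other neighbor D may send the traffic straight back through A. Adding the two diagonal links to make a full mesh gives A two alternates, C and D.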
OSPF Best Practices
Understanding and using best practices is very important, though it may not be feasible in all networks due to budgetary, political or other technical constraints.
In this post I will explain the best practices for OSPF networks. These best practices come from my real-life design and deployment experience, and from the knowledge and lessons learned over 15 years of Enterprise, Service Provider and Mobile Operator networking.
Before we start, I want to touch briefly on topology and reachability information in OSPF, as I will use these terms many times throughout this post and you will see them whenever you study network design.
Reachability information means the IP addresses and subnets on the devices and the links. Router loopbacks and the links between the routers have IP addresses, and this information is exchanged between the routers in OSPF. This process is known as control plane learning.
Topology information means the connections between the routers and their metrics: which router is connected to which one. With this information, routers build a shortest path tree in OSPF. Note that IS-IS uses the same process to find the shortest path to each destination, but there is no topology information in EIGRP. In other words, EIGRP neighbors don’t send topology information to each other.
Another term which I will use throughout this post is single area design.
Single area OSPF design is also known as flat OSPF design. Generally it refers to a deployment with OSPF Area 0 (the backbone area) only. There is no second area; all the nodes are in the backbone area.
- Stub, Totally Stub, NSSA and Totally NSSA areas can create suboptimal routing in the network, because these area types prevent some information from entering an area. Whenever there is specific information in the routing table, the optimal path can be found; whenever there is summarization (less reachability information in the routing table), suboptimal routing might occur.
- OSPF areas are used for scalability. If you don’t have a valid reason, such as hundreds of routers or resource problems on the routers, don’t use multiple areas.
- OSPF multi area design increases network complexity. Complexity is sometimes necessary and is not a bad thing in itself, but be aware that multi area design is more complex than single/flat OSPF area design: you need to place the ABRs in the correct places and deal with multi area related problems such as MPLS Traffic Engineering and MPLS LSP issues.
- Two is company, three is a crowd in design. Having two OSPF ABRs provides high availability, but three ABRs is not a good idea. Unless you have a capacity requirement, I don’t recommend having three links, nodes, logical entities and so on in the network.
- ABRs slow down network convergence. This is important to know: in a single/flat OSPF design there is no ABR, so there is no generation of Type 3 LSAs from Type 1 and Type 2 LSAs, and no generation of Type 4 LSAs from Type 1 LSAs either.
- Having a separate OSPF area per router is generally considered bad practice. You should monitor the routers’ resources carefully and place as many routers as you can in one OSPF area.
- Not every router has a powerful CPU and plenty of memory; you can split up the routers based on their resource availability. Low end devices can be placed in a separate OSPF area, and that area can be made a Stub, Totally Stub, NSSA or Totally NSSA area.
- Always look for summarization opportunities, but know that summarization can create suboptimal routing. Suboptimal routing may not be a problem for some applications, but other applications require very low delay, jitter and packet loss. Suboptimal routing increases the chance of added delay (latency).
- A good IP addressing plan is important for OSPF multi area design. It allows OSPF summarization (of reachability information), and thus faster convergence and a smaller routing table.
- A smaller routing table makes troubleshooting easier. Dealing with less information decreases the mean time to repair: identifying and fixing the problem is faster because there are fewer prefixes in the routing table and in the routing protocol databases, so troubleshooting is probably manageable by engineers of average skill.
- A smaller routing table reduces convergence time as well. Summarization reduces the routing table size, which is why it provides faster network convergence.
- OSPF NSSA areas are generally used at the Internet edge of the network: the Internet routers do not need all the OSPF LSAs, yet redistribution of selected BGP prefixes is still common there.
- Topology information is not sent between different OSPF areas; this reduces the flooding domain and allows large scale OSPF deployments. If you have hundreds of routers in your network, you can consider splitting the OSPF domain into multiple OSPF areas. But there are other considerations for multi area design, which will be explained in this chapter.
- Use passive interfaces as much as you can. An interface should be made passive if you don’t want to set up an OSPF neighborship on it.
- For very large scale OSPF designs, transit subnets can be removed from the OSPF reachability information. This is defined in RFC 6860 and is known as ‘prefix suppression’ on Cisco routers. Removing these subnets reduces the routing table size, and thus speeds up network convergence and makes troubleshooting easier.
- If there will be maintenance on a router which runs OSPF, ‘max-metric router-lsa’ should be enabled to take the router out of the forwarding path without packet loss. The router actually stays in the OSPF topology, but since it advertises the maximum metric in its Type 1 LSA (Router LSA), traffic is not forwarded through it if there is an alternate path. If there is no alternate path, the router still receives network traffic even with ‘max-metric router-lsa’ configured.
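The maximum metric behaviour in the last bullet can be illustrated with a short Python sketch. It models the router under maintenance as advertising the maximum link metric (0xFFFF, i.e. 65535) on its own links and then reruns a shortest path calculation. The topologies, costs and function names are hypothetical, and the model ignores stub networks and other details of a real OSPF implementation.

```python
import heapq

MAX_METRIC = 0xFFFF  # 65535, the largest metric a Router LSA link can carry


def shortest_path(graph, source, destination):
    """Dijkstra over a directed cost graph; returns (cost, path) or None."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(
                    queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


def apply_max_metric(graph, router):
    """Copy the graph with every link advertised by `router` set to maximum."""
    updated = {node: dict(links) for node, links in graph.items()}
    for neighbor in updated[router]:
        updated[router][neighbor] = MAX_METRIC
    return updated


# Router R sits on the best path from A to B; C offers a costlier detour.
with_alternate = {
    "A": {"R": 1, "C": 10},
    "R": {"A": 1, "B": 1},
    "C": {"A": 10, "B": 10},
    "B": {"R": 1, "C": 10},
}
# Same A-R-B chain, but with no detour around R.
no_alternate = {
    "A": {"R": 1},
    "R": {"A": 1, "B": 1},
    "B": {"R": 1},
}

if __name__ == "__main__":
    print(shortest_path(with_alternate, "A", "B"))
    print(shortest_path(apply_max_metric(with_alternate, "R"), "A", "B"))
    print(shortest_path(apply_max_metric(no_alternate, "R"), "A", "B"))
```

With the detour available, traffic from A to B moves from A-R-B (cost 2) to A-C-B (cost 20) once R advertises the maximum metric. Without a detour, the path still goes through R at cost 65536, which is why the router keeps receiving traffic.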
This will be the first post I share in Turkish. I am excited. But I am even more excited that I will be delivering CCDE training in Turkey and in Turkish.
My followers know that I have been delivering Cisco CCDE training for more than 2 years, and hundreds of people from all over the world have attended my trainings.
Although they are mostly Online/Live, I also deliver Onsite trainings in the USA, Dubai, Africa, Qatar and Europe.
People attend this training primarily to learn network design. Of course, with what they learn, it also becomes possible for them to pass the CCDE. Within two years, more than 30 of my students have already received their CCDE numbers.
MPLS Layer 3 VPN Deployment
In this post I will explain MPLS Layer 3 VPN deployment through a case study. The deployment is mainly for a green field environment, where you deploy network nodes and protocols from scratch. This post doesn’t cover migration from legacy transport mechanisms such as ATM and Frame Relay, as that is covered in a separate post on the website.
MPLS Transport Profile (MPLS-TP)
Multi-Protocol Label Switching Transport Profile (MPLS-TP) is a new technology developed jointly by the ITU-T and the IETF. The key motivation is to add OAM functionality to MPLS in order to monitor each packet and thus enable MPLS-TP to operate as a transport network protocol.
Quality of Service Best Practices
What is a best practice? Below is the Wikipedia definition of best practice. It applies to education as well.
A best practice is a method or technique that has been generally accepted as superior to any alternatives because it produces results that are superior to those achieved by other means or because it has become a standard way of doing things, e.g., a standard way of complying with legal or ethical requirements.
Always classify and mark applications as close to their sources as possible.
Although in real-life designs we may not be able to follow best practice network design due to many constraints, such as technical, budgetary or political constraints, knowing the best practices is critical for network design in real life as well as in the exams.
Below are the generally accepted Quality of Service Best Practices. I cover Quality of Service Best Practices and many other technology best practices in CCDE In-Depth, which is my latest network design book.
- Classification and marking are usually done in both the ingress and egress directions, but queuing and shaping are usually done on egress.
- Ingress queuing can be done to prevent head-of-line blocking. Otherwise, queuing is done at the egress interface in almost every case.
- Less granular fields such as CoS and MPLS EXP (less granular because of their number of bits) should be mapped to DSCP as close to the traffic source as possible. CoS and EXP are 3 bits each, so you can have a maximum of 8 classes with them. DSCP is 6 bits, so 64 different classes can be used; DSCP is therefore considered more granular. This knowledge is important because when MPLS Layer 3 and Layer 2 VPN are compared, MPLS Layer 3 VPN provides more granular QoS, as it uses DSCP instead of CoS (the Class of Service bits, which are carried in Layer 2).
- Follow standards based DiffServ PHB markings if possible, to ensure interoperability with SP networks and enterprise networks, or when merging networks together. RFC 4594 provides configuration guidelines for DiffServ service classes.
- If there is real time, delay sensitive traffic, LLQ should be enabled, because the LLQ priority queue is always served before any other queue. When the traffic in the LLQ has been sent, the other queues are handled.
- LLQ is the combination of CBWFQ (Class Based Weighted Fair Queuing) and Priority Queuing.
- Enable queuing at every node which has potential for congestion. For example, at a Wide Area Network edge node the bandwidth towards the WAN is generally less than that of the local area network or datacenter, so the WAN edge is a common place for QoS queuing mechanisms.
- Limit LLQ to 33% of the link bandwidth capacity. Otherwise real time traffic such as voice can eat up all the bandwidth, and other applications suffer in case of congestion.
- Enable admission control on LLQ. This is very important: if you have allocated bandwidth which can accommodate only 10 voice calls, an 11th voice call disrupts all 11 calls, not only the 11th. Admission control for real time traffic is important.
- Policing should be done as close to the source as possible, because you don’t want to carry traffic which would be dropped anyway. (This is a common network design suggestion which I also give my clients for security filters.) This is one of the most important Quality of Service Best Practices.
- Do not enable WRED on LLQ. (WRED is only effective for TCP based applications. Most, if not all, real time applications use UDP, not TCP.)
- Allocate 25% of the capacity to the Best Effort class if there is a large number of applications in the default class.
- For a link carrying a mix of voice, video and data traffic, limit the priority queue to 33% of the link bandwidth.
- Use WRED for congestion avoidance on TCP traffic. WRED is effective only for TCP traffic.
- Use DSCP based WRED wherever possible. This provides a more granular implementation.
- Always enable QoS in hardware as opposed to software if possible. In the campus environment, you should enable classification and marking on the switches as opposed to the routers. Switches provide hardware based Quality of Service.
- Because the 802.1p bits (CoS bits) are lost when the packet enters the IP or MPLS domain, mapping is needed. Always implement QoS in hardware, if possible, to avoid performance impact.
- Switches support QoS in hardware, so, for example, in the campus, classify and mark the traffic at the switches.
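The granularity point about CoS, EXP and DSCP comes down to simple bit arithmetic, which this minimal Python sketch shows. The mapping function keeps only the three most significant DSCP bits, which is one common default convention and an assumption here, not something the list above mandates. The function and constant names are hypothetical; the DSCP values for EF, AF41 and CS3 are the standard ones.

```python
COS_BITS = 3   # 802.1p CoS and MPLS EXP are both 3-bit fields
DSCP_BITS = 6  # DSCP is a 6-bit field

# Standard DSCP code points used as examples.
DSCP_EF = 46    # Expedited Forwarding, typically voice
DSCP_AF41 = 34  # Assured Forwarding class 4, low drop precedence
DSCP_CS3 = 24   # Class Selector 3


def class_count(bits):
    """Number of distinct classes a field with this many bits can express."""
    return 2 ** bits


def dscp_to_cos(dscp):
    """Map a 6-bit DSCP value to 3 bits by keeping its top three bits.

    This is an assumed default mapping; real networks often use an
    explicit, operator-defined mapping table instead.
    """
    if not 0 <= dscp < class_count(DSCP_BITS):
        raise ValueError("DSCP must be between 0 and 63")
    return dscp >> (DSCP_BITS - COS_BITS)


if __name__ == "__main__":
    print("CoS/EXP classes:", class_count(COS_BITS))
    print("DSCP classes:   ", class_count(DSCP_BITS))
    for name, value in (("EF", DSCP_EF), ("AF41", DSCP_AF41), ("CS3", DSCP_CS3)):
        print(name, "->", dscp_to_cos(value))
```

Eight DSCP values collapse into each CoS or EXP value under this mapping (EF becomes 5, AF41 becomes 4, CS3 becomes 3), so the drop precedence distinction inside an AF class is lost once only 3 bits are available.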
When it comes to multi domain or inter datacenter communication, minimizing the broadcast traffic between the datacenters is an important scaling requirement.
Especially if you are dealing with millions of end hosts, localizing the broadcast traffic is critical to save resources, such as bandwidth, CPU and memory, on the network and the end hosts.
In this post I will explain how the ARP cache is populated in OTV and EVPN technologies, and the importance of the ARP proxy function.
This is a free webinar, but it requires registration and seats are limited, so please register immediately.
Webinar on Tuesday, February 28, 2017 7:00 PM – 8:30 PM AST.
My February 2017 CCDE class is now over. The course ran for 11 days and, as usual, it started with lots of advanced technology lessons. All the critical CCDE exam topics (IGP, BGP, MPLS and the other technologies) were covered in detail from the design point of view.
A minimum of 4 hours was spent each day. We had 50+ hours of training this time, which helped to engage the participants during the training. For example, many existing CCDE network engineers shared CCDE Practical scenarios, CCDE exam tips and tricks, and their strategies with the students.
Since I adjust and expand my course outline continuously, students learn more about network design each time. There were many CCDE Practical scenarios in the February class, and those scenarios will definitely help the attendees in the CCDE Practical exam.
By the way, the next CCDE Practical exam is on 22 February 2017, in 2 days’ time!
I wish all the attendees, including my students, ‘good luck’ in the upcoming CCDE Practical exam. I look forward to sharing their feedback and success stories in future posts.
For the upcoming April Online and May Dubai CCDE Boot camp registration, please click here
As many of you know, I was born in Turkey. Unfortunately, the educational system of that country is very weak. And guess what: if you can’t afford to go to a private school in Turkey, you may not be able to learn English in a government school.
Do you need an LSP for MPLS?
In this post, I will go through the topics below. As I have seen, this is one of the points which network engineers struggle to understand.
- What is an LSP (Label Switched Path)?
- What was the purpose of having an LSP in the first place?
- Do we need an LSP for MPLS and MPLS applications such as 2547 VPNs?
- MPLS over LSP vs. MPLS over IP encapsulations
- MPLS VPN infrastructure in 2017
Should I use Cisco OTV for the Datacenter Interconnect? This question comes not only from my students but also from the companies to which I provide consultancy.
I will not go through the OTV details, such as how it works or design recommendations. But let me briefly remind you what OTV is, why it is used and where it makes sense.