IP Fast Reroute, LFA (Loop-Free Alternate), Remote LFA, and recovery and protection in general. In this post, I will share a discussion with one of my Slack group members, Driss Jabbar. He is a CCDE, a highly skilled network engineer, and the author of some posts on this website. You can contact him on LinkedIn.
Are Inter-AS MPLS VPNs commonly deployed? In real-life deployments, which Inter-AS MPLS VPN option is the most common? What are the use cases of Inter-AS MPLS VPNs? This is not a theory post; I will share practical information with you.
OSPF Prefix Suppression helps a company run 200 routers in its network without any problem. You may think that some companies use more than 200 routers in their OSPF networks, so why is this post special? You will understand why in 10 minutes.
What is the KISS principle? It stands for Keep It Simple and Stupid, but what does it really mean in networking?
Flat OSPF networks, or single area OSPF networks, are real. In fact, most OSPF networks deployed today are flat OSPF networks. But how many routers can safely be placed in an OSPF area? Is there a number from real-world OSPF deployments? I will share one in this post.
OSPF Best Practices
Understanding and using best practices is very important, though it may not be feasible in all networks due to budget, political or other technical constraints.
In this post I will explain the best practices for OSPF networks. These best practices come from my real-life design and deployment experience, knowledge and lessons learned over 15 years in Enterprise, Service Provider and Mobile Operator networking.
Before we start, I want to touch briefly on Topology and Reachability information in OSPF, as I will use these terms many times throughout this post and you’ll see them whenever you study network design.
Reachability information means the IP addresses and subnets on the devices and the links. Router loopbacks and the links between the routers have IP addresses, and this information is exchanged between the routers in OSPF. This process is known as control plane learning.
Topology information means the connections between the routers and their metrics: which router is connected to which one, and at what cost. With this information, routers build a shortest path tree in OSPF. Note that IS-IS uses the same process to find the shortest path to each destination, but there is no topology information in EIGRP. In other words, EIGRP neighbors don’t send topology information to each other.
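To make the distinction concrete, here is a minimal Python sketch of how a link state router could turn topology information into a shortest path tree. The router names, metrics and the `spf` helper are hypothetical and are not taken from any real OSPF implementation; real SPF runs over the LSAs in the link state database rather than a plain dictionary.

```python
import heapq

def spf(topology, root):
    """Dijkstra's algorithm: returns the cost and the parent of every reachable router."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, metric in topology[node].items():
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return dist, parent

# Topology information (hypothetical): who is connected to whom, and the metric.
topology = {
    "R1": {"R2": 10, "R3": 5},
    "R2": {"R1": 10, "R4": 1},
    "R3": {"R1": 5, "R4": 20},
    "R4": {"R2": 1, "R3": 20},
}

# Reachability information (hypothetical): which prefixes sit behind which router.
prefixes = {"R4": ["192.168.4.0/24"]}

dist, parent = spf(topology, "R1")
for router, nets in prefixes.items():
    for net in nets:
        print(f"{net} via {router}, cost {dist[router]}, previous hop {parent[router]}")
```

From R1, the path through R2 (cost 11) beats the path through R3 (cost 25), which R1 can only work out because it knows the full topology and not just the prefixes.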
Another term which I will use throughout this post is single area design.
Single area OSPF design is also known as Flat OSPF design. Generally, it refers to an OSPF Area 0 only (Backbone area) deployment. There is no second area; all the nodes are in the backbone area.
- Stub, Totally Stub, NSSA and Totally NSSA Areas can create suboptimal routing in the network, because these area types prevent some information from entering an area. Whenever specific information is in the routing table, the optimal path can be found; whenever there is summarization (less reachability information in the routing table), suboptimal routing might occur.
- OSPF Areas are used for scalability. If you don’t have a valid reason, such as hundreds of routers or resource problems on the routers, don’t use multiple areas.
- OSPF Multi area design increases network complexity. Complexity is sometimes necessary and not a bad thing, but be aware that multi area design is more complex than single/flat OSPF area design: you need to place the ABR in the correct place and deal with multi area design related problems such as MPLS Traffic Engineering and MPLS LSP issues.
- Two is company, three is a crowd in design. Having two OSPF ABRs provides high availability, but three ABRs is not a good idea. Unless you have a capacity requirement, I don’t recommend having three links, nodes, logical entities and so on in the network.
- An ABR slows down network convergence. This is important to know: without an ABR, in a single/flat OSPF design, there is no Type 3 LSA generation from Type 1 and Type 2 LSAs. Similarly, Type 4 LSAs are generated by the ABR from Type 1 LSAs.
- Having a separate OSPF area per router is generally considered bad. You should monitor router resources carefully and place as many routers as you can in one OSPF area.
- Not every router has a powerful CPU and plenty of memory, so you can split up the routers based on their resource availability. Low end devices can be placed in a separate OSPF area, and that area type can be changed to Stub, Totally Stub, NSSA or Totally NSSA.
- Always look for summarization opportunities, but know that summarization can create suboptimal routing. Suboptimal routing may not be a problem for some applications, but other applications require very low delay, jitter and packet loss. Suboptimal routing increases the chance of added delay (latency).
- A good IP addressing plan is important for OSPF Multi Area design. It allows OSPF summarization (Reachability), and thus faster convergence and a smaller routing table.
- A smaller routing table makes troubleshooting easier. Dealing with less information decreases mean time to repair: with fewer prefixes in the routing table and the routing protocol databases, identifying and fixing a problem is faster, and the network is probably manageable by engineers of average skill.
- A smaller routing table decreases convergence time as well. Summarization reduces the routing table size, which is why it provides faster network convergence.
- An OSPF NSSA area is generally used at the Internet Edge of the network, since the Internet routers don’t need to have all the OSPF LSAs, yet redistribution of selected BGP prefixes is still common there.
- Topology information is not sent between different OSPF areas. This reduces the flooding domain and allows large scale OSPF deployment. If you have hundreds of routers in your network, you can consider splitting the OSPF domain into multiple OSPF areas. But there are other considerations for Multi Area design, and they will be explained in this chapter.
- Use passive interface as much as you can. Passive interface should be enabled on every interface where you don’t want to set up an OSPF neighborship.
- For very large scale OSPF design, transit subnets can be hidden from the OSPF routing table. This is defined in RFC 6860. This feature is known as ‘prefix suppression’ on Cisco routers. Hiding these subnets reduces the routing table size, and thus speeds up network convergence and makes troubleshooting easier.
- If there will be maintenance on a router which runs OSPF, ‘max-metric router-lsa’ should be enabled to take the router out of the forwarding path without packet loss. The router actually stays in the OSPF topology, but since it advertises the maximum metric in its Type 1 LSA (Router LSA), traffic is not forwarded through it if there is an alternate path. If there is no alternate path, the router still receives network traffic even with ‘max-metric router-lsa’.
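The summarization and addressing plan points in the list above can be illustrated with a short Python sketch using the standard `ipaddress` module. The subnets and the variable names are hypothetical; the sketch only shows why contiguous addressing inside an area lets an ABR advertise one prefix instead of many.

```python
import ipaddress

# Hypothetical Area 1 subnets, allocated contiguously from one block.
good_plan = [ipaddress.ip_network(p) for p in (
    "10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24",
)]

# Hypothetical Area 1 subnets, allocated without a plan.
poor_plan = [ipaddress.ip_network(p) for p in (
    "10.1.0.0/24", "10.7.5.0/24", "172.16.9.0/24",
)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
good_summary = list(ipaddress.collapse_addresses(good_plan))
poor_summary = list(ipaddress.collapse_addresses(poor_plan))

print("good plan:", [str(n) for n in good_summary])
print("poor plan:", [str(n) for n in poor_summary])
```

With the contiguous plan, four /24s collapse into a single 10.1.0.0/22; with the scattered plan nothing can be merged, so every prefix has to be advertised individually.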
OSPFv2 by default sets up only one adjacency over a single link. But this can sometimes be an issue, and as a network designer you should understand the consequences and know the available solutions.
Placing a link in the wrong OSPF area can create suboptimal routing, especially in a hub and spoke topology.
In IS-IS or OSPFv3, this wouldn’t be an issue because IS-IS and OSPFv3 allow a link to be placed in more than one area or level. (In IS-IS, the area is assigned to the router, not to the link. Thus I use the level keyword.)
Let’s look at the network below to understand the issue and how OSPF multi-area adjacency can solve the suboptimal routing problem.
“Why are dynamic routing protocols used?” is a question usually asked by newcomers to the networking field, especially after they have first heard about routing protocols. They also often ask: What is the difference between static routing and dynamic routing protocols?
And the common answer is that dynamic routing protocols are scalable.
In other words, with dynamic routing protocols there is no need to configure a manual entry for each destination or to specify the next hop IP address or interface.
These are good reasons. But are these really the only benefits? For very small networks, scalability is a reasonable and correct answer. But for more sophisticated networks, there are other important reasons.
Before I explain the other reasons, let me clarify why static routing requires lots of manual configurations and why it is not scalable, compared to dynamic routing protocols.
Figure 1: Why are dynamic routing protocols used?
OSPF Area Types – Different Areas in OSPF are used to create smaller fault domains. There are two OSPF area types in total:
OSPF Backbone area and OSPF non-backbone area.
The backbone area in OSPF is Area 0. OSPF prevents loops by using the backbone area concept. All the non-backbone areas should be connected to the Backbone area.
There are many Non-Backbone OSPF Area types. These are: Normal Area, Stub, Totally Stub, NSSA and Totally NSSA Areas.
In this article I will explain the non-backbone OSPF areas from the design point of view and share some caveats about the OSPF design.
OSPF LSAs (link state advertisements) are used to create a logical network topology. But why do we have 11 different LSA types? What are their purposes? The most important questions often go unasked by engineers, so you can’t find many places on the Internet which provide these answers.
The reason for having 11 types of OSPF LSA is scalability. If the network consists of only a small number of routers (the routers, links and physical topology all matter in defining the size), then you would have at most two types of LSA.
Let me explain the OSPF LSA types, and it will become clear why we would have at most two types of LSA.
Note: In this article only OSPFv2 LSA types are explained.
EIGRP vs OSPF – The comparison table below is your primary resource for the OSPF and EIGRP routing protocols when you compare them from the design point of view.
Knowing and understanding these design practices will help you not only with real life network design but also with any design certification exam.
If you have any questions regarding the parameters in the comparison chart, please share them in the comments so I can provide more information.
Flooding in a full-mesh topology is a big concern for network design experts, especially in large-scale OSPF deployments. When a link or node fails in an OSPF network, the failure information is flooded everywhere in the same area. If a Flat OSPF network design is used, the problem gets bigger. Each router receives at least one copy of the new information from each neighbor.
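A rough feel for the flooding load in a full mesh can be had from a small Python calculation. This is only a simplified worst-case model, and the function name is hypothetical: it assumes every router refloods a new LSA to all neighbors except the one it received it from, before it hears the same LSA from anyone else. Real OSPF implementations suppress some of these copies.

```python
def worst_case_copies_full_mesh(n):
    """Worst-case LSA copies sent when one router in an n-router full mesh originates an update."""
    if n < 2:
        return 0
    first_hop = n - 1               # originator sends one copy to every neighbor
    refloods = (n - 1) * (n - 2)    # each neighbor refloods to everyone except the sender
    return first_hop + refloods     # equals (n - 1) ** 2

for n in (4, 10, 50):
    print(f"{n} routers: up to {worst_case_copies_full_mesh(n)} copies for one LSA change")
```

Under this model the number of copies grows with the square of the number of routers, which is why full-mesh flooding gets attention in large flat designs.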
Understanding everything about routing design is a no-brainer, especially if you have the chart below on your wall.
The table below highlights the pros and cons of each routing protocol. Of course, you need to consider the design attributes shown in Figure A before embarking on routing design.
Should you like the comparison of the routing protocols illustrated in the table below or should you want to see similar comparison for other technologies, feel free to add your comment in the comment section.
Another boon for all my readers!
If you are interested in network design or considering CCDE, CCDP, or CCDA certification, you can subscribe for membership here so that you can peruse all the design resources (Videos , Tests , Case Studies , and E-Books).
Figure A: Comparison of Routing Protocols
OSPF Design – In the picture below, where should you place an OSPF ABR (Area Border Router) to scale the OSPF design? Why?
Please share your thoughts in the comment box below.
The first 5 correct answers will get my CCDE Preparation Workbook for free. Please subscribe to the email list so I can see your email address for communication.
OSPF and MPLS are the two most commonly used technologies in an MPLS VPN environment.
In this post I will share a mini design scenario with you and ask a couple of questions about the fictitious company’s architecture.
OSPF Protocol – OSPF (Open Shortest Path First) is a dynamic routing protocol which creates a topology between the routers to distribute routing information inside an Autonomous System.
If you are not familiar with OSPF, don’t worry! In this article OSPF will be explained in great detail.
If you are interested in the design aspects of OSPF, many OSPF design examples will be covered in the article.
Maybe OSPF network engineering interview questions are what you are looking for.
I received a couple of questions about the topologies and wanted to explain one of them in this post for everyone.
I used the topology below in the video:
The left picture illustrates the triangle physical topology and the right one the square topology.
Distribution layer devices are advertising the same networks in both topologies. The picture says router, but it could be a multilayer switch as well.
Assume we are running OSPF, but using triangle instead of square applies to any other IGP (EIGRP, IS-IS, even RIP).
The reason you want to use the triangle topology is high availability. In the left topology, if a link between the core and distribution layer fails, there will not be any routing protocol convergence, since the core devices do ECMP (Equal Cost Multi Path) towards distribution and distribution does ECMP towards core. All the links are in the RIB and FIB, so they are used actively (flow based load-balancing).
For the square topology, if the same link fails, the left core device’s path to the destination prefix through the other core device has a higher metric than the direct (failed) path, so there is no equal cost path. Unless you enable unequal cost multipath with EIGRP, you can’t place two routes for that prefix in the FIB. (You may want to check the OSPF Optimized Multipath draft.)
Question: In real life deployments, would we announce the same prefix from two different distribution switches as depicted in the picture?
Answer: Yes, we do. If we have a distribution layer as depicted in the picture, it means we have an access layer as well. If the access layer is Layer 3, meaning the default gateway for the devices is the access layer switch, then the access and distribution layers would be running a routing protocol. From the design point of view you would want to run OSPF there, since OSPF also runs between distribution and core and you don’t want to have more than one IGP in your topology unless you have to.
I used Layer 3 access as an example for simplicity, but we also announce the prefixes from both distribution layer devices in a multilayer access design (Layer 2 between access and distribution), with or without MLAG (VSS, vPC, MLAG with ICCP). If you are using an MLAG based solution, it is a matter of the number of OSPF neighborships. I would like to see your comment if you know or can guess the reason.
RFC 2547 defines standard MPLS VPN to carry customer prefixes over the MPLS backbone.
In February 2006, RFC 4364 was published; it also covers Inter-AS VPNs, which are also known as Multi-AS VPNs. RFC 4364 obsoleted RFC 2547 and defined many other applications for MPLS VPNs, such as CSC, which is known as Carrier Supporting Carrier in Cisco terminology and Carrier of Carrier in the Juniper definition.
With basic Layer 3 MPLS VPN, Enterprise customers can carry their prefixes from multiple sites over the SP backbone. It is a multipoint to multipoint connection. With the AToM based MPLS solution, which is Cisco’s E-Line solution, customer sites are connected point to point, and with VPLS, multipoint to multipoint.
The basic difference between VPLS and IP/VPN from the customer’s point of view: with VPLS, all attached sites share the same L3 network. The Service Provider acts as a big switch for the customer. IP/MPLS VPNs use a different IP subnet at each site.
With IP/VPN, also known as BGP VPN or L3 VPN, the customer runs an IP routing protocol or static routes with the Service Provider, and the customer equipment, known as CE, doesn’t see other CEs as directly connected the way it would in VPLS or AToM based MPLS.
Depending on the customer’s expectations of the Service Provider, for the MPLS L3 VPN case the customer can run any of the routing protocols, including EIGRP, OSPF, IS-IS and BGP, or static routes. You may want to talk with your Service Provider before you decide, since some Service Providers don’t support every routing protocol. Most of them, if not all, support BGP.
If the customer wants very granular policy control and dual homed site connectivity, and the customer’s network staff is well trained, the best choice would be BGP.
In the past, fast convergence was an issue with BGP, and maybe it still is with a vanilla BGP configuration, but recent enhancements allow BGP to converge very fast thanks to the BGP Fast Reroute mechanism, BGP PIC.
The metric information of all IGPs can be carried end to end over the SP MPLS backbone. In this case the SP core behaves differently. For OSPF there is the Superbackbone concept and for IS-IS there is the L3 backbone concept. This is out of the scope of this post, so I will not explain further. But if you are interested and want to learn more, please comment, and I will definitely write about them.
One more caveat for the PE-CE protocol: for almost all protocols, if the customer has a backdoor link to another customer site, loops or suboptimal path usage may occur. We generally prefer an MPLS link when it is necessary to have a low latency, secure and reliable connection compared to the Internet based option.
If the customer has a backup Internet link (not MPLS, but maybe DSL, 3G/LTE, satellite, microwave or cable) and requires low latency, predictable delay variation (called jitter) and a reliable and (relatively) secure connection, it probably wants to use the MPLS connection as primary and the Internet connection as backup. That said, LTE is much cheaper and provides very high bandwidth nowadays, and it has started to take its place as a primary connection on some networks or parts of the network, such as remote offices.
When an ABR receives Type 1 router and Type 2 network LSAs within an area, it sends only the reachability information to another area. The ABR is the choke point where topology information is hidden and only reachability information is sent between the areas.
When an ABR sends Type 3 summary LSAs into another area, it says: I can reach networks 192.168.1.0/24, 192.168.2.0/24 and so on, and you can reach these networks through me. But the ABR does not say in the summary LSA: if you want to reach 192.168.1.0/24, send the packet to me first, I will send it to Router A, Router A will send it to Router B, and so on. This means the ABR hides topology information.
Internal routers within an area believe what their ABR says; they cannot calculate the end to end path since they don’t know the full topology. This is distance vector behavior. Isn’t it the same with EIGRP? At every hop, an EIGRP router calculates the best path, whose metric is the feasible distance to the destination, and sends it to the next router.
One thing is important here: calculate and then send! The calculation is done first, so other routers wait for this router to finish its job and send the route with its best path. The receiving router puts this information from all the sending routers into the EIGRP topology database and then runs DUAL.
But link state protocols, both OSPF and IS-IS, first flood a received LSA to their neighbors and then start to run SPF. Of course, here we are talking about milliseconds. But in a large environment with lots of links, this can be a scaling issue.
Likewise, if an ABR receives multiple Network Summary LSAs from other ABRs across the backbone, the original ABR will choose the lowest cost advertised amongst them, create a NEW summary LSA and send it into its attached areas. Two things are important in this sentence:
1. It will create a new summary LSA. Unlike a Type 5 external LSA, a Type 3 summary LSA is limited to an area, so when an ABR receives a summary LSA from another ABR, it creates a new summary and sends it to its attached areas. This can be a scaling issue.
2. The ABR needs to create a new Type 3 summary LSA for every attached area. This can also be a scaling issue. From here we could discuss how many ABRs per area, or how many routers per OSPF domain, but let’s keep this for another blog post.
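The two points above can be sketched in a few lines of Python. Everything here is a hypothetical illustration (the `reoriginate_summaries` helper, the ABR names, the costs and the area IDs); it assumes the ABR compares total cost, meaning its own cost to the advertising ABR plus the metric carried in the summary, and then originates one new summary per prefix per attached area.

```python
def reoriginate_summaries(received, cost_to_abr, attached_areas):
    """Pick the lowest total cost per prefix, then build one new summary per attached area."""
    best = {}
    for prefix, abr, advertised_metric in received:
        total = cost_to_abr[abr] + advertised_metric
        if prefix not in best or total < best[prefix]:
            best[prefix] = total
    # One new Type 3 summary per (area, prefix) pair.
    return [(area, prefix, cost)
            for area in attached_areas
            for prefix, cost in sorted(best.items())]

# Summaries heard across the backbone: (prefix, advertising ABR, advertised metric).
received = [
    ("192.168.1.0/24", "ABR-A", 10),
    ("192.168.1.0/24", "ABR-B", 5),
    ("192.168.2.0/24", "ABR-A", 3),
]
cost_to_abr = {"ABR-A": 1, "ABR-B": 20}   # this ABR's backbone cost to each remote ABR
attached_areas = ["1", "2"]               # non-backbone areas attached to this ABR

new_lsas = reoriginate_summaries(received, cost_to_abr, attached_areas)
for area, prefix, cost in new_lsas:
    print(f"area {area}: summary {prefix} cost {cost}")
```

Two prefixes and two attached areas already mean four newly originated summaries, so the work on the ABR grows with prefixes multiplied by attached areas.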
Let’s create a scenario where an ABR connects Area 0 (the backbone area) and Area 1 (an internal area). First let’s talk about the case with just one ABR, and then redundant ABRs.
If there is only one ABR and we cannot add a second ABR, or high availability is not a big concern at this time, hiding topology and reachability information is not a big issue. The only concern would be in a Service Provider environment, if they put their PEs into Area 1 and Area 1 does not have specific routes to destinations outside Area 1.
This is because PEs need to know each other’s loopback interfaces as /32 host routes to create the LSP for the transport label. So the solution might be leaking the PE loopbacks into Area 1. If the ABR in this situation goes down, all internal Area 1 routers will lose their connectivity to the rest of the domain.
It is a better idea to put more than one ABR per area for high availability. In this case, route leaking for special purposes is still needed, but there are also at least two more caveats.
First, if we hide reachability information (by default we hide topology information), internal Area 1 routers will choose the best path according to their cost to the ABR. But since those internal routers don’t know the entire topology of the OSPF domain, the path from the ABR to the destination in OSPF Area 0 can be suboptimal. This is the edge to core suboptimality consideration; it may or may not be an issue for your network and its applications. But if you have tight latency requirements, this may affect your design choice.
Second, blackholing could be a problem if we are also hiding reachability information from Area 1 towards the backbone area. In this case, if the link from one of the ABRs to one of the internal Area 1 routers goes down, traffic for the network behind that internal router can still arrive at the ABR with the failed link, and that traffic could be blackholed.
The general solution when we summarize from two ABRs, as in this situation, is to put a link between the ABRs and send the routing information over that link. Here another big consideration comes into play: in which area would you put that inter-ABR link? Area 0? Area 1? Let’s keep this for another blog post.
An OSPF ABR (Area Border Router) is the router that non-backbone area routers use as the exit from their area. The OSPF RFC states that if a router has interfaces in two areas, it can be considered an ABR.
But in implementations, in order to be an ABR, a router needs to have at least one interface connected to Area 0, the backbone area.
In the figure, R3 is a backbone router but not an ABR, since it only has connections to Area 0. In order to be an ABR, a router needs to have one interface in Area 0 and at least one other interface in a different area.
R2 and R6 have connections to both Area 1 and Area 0, thus they are ABRs. R4 is also an ABR since it has connections to both Area 0 and Area 2.
An ABR aggregates topology information from one area to another area. What does that mean?
In the figure, although R3 appears to be the only router in the backbone area, let’s assume we have other routers in Area 0 as well. Their connections to each other and their metric information are not sent to other areas by the ABR.
Instead, the ABR sends only the reachability information. The routers in different areas only know that they can reach each other, and only through the information provided by the ABRs.
That’s why, if you design a multi area OSPF network, you can’t have end to end visibility. A router can have topology information about its own area only.
You need OSPF or IS-IS to distribute link information such as reserved, unreserved and used bandwidth, metric, and link colouring information. This information is used by the CSPF (Constraint based Shortest Path First) algorithm.
For those who are familiar with MPLS Traffic Engineering: the path is calculated either at each and every device, or centrally with offline computation tools such as an NMS.
For the distributed computation, CSPF, which is a flavour of the Shortest Path First (SPF) algorithm, is used.
CSPF computes a dynamic unidirectional MPLS TE LSP (Label Switched Path) by consulting the Traffic Engineering Database (TED).
The TED has different attributes from the regular link state database, such as reserved, used and unreserved bandwidth on the interfaces, link colouring attributes and so on. Link colouring information is used to avoid SRLG (Shared Risk Link Group) paths in the transport network.
This information can only be provided by link state protocols. Thus, if you want to calculate the MPLS TE LSP without the help of an NMS (Network Management System), in a distributed fashion on each and every LSR, you need to use a link state routing protocol, which currently means OSPF or IS-IS.
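As a rough illustration of the idea, CSPF can be thought of as SPF run over the TED after links that fail the constraints have been pruned. The Python sketch below assumes a single constraint (unreserved bandwidth) and a hypothetical TED layout, node names and `cspf` helper; real implementations also consider link colours, priorities, SRLGs and tie-breaking rules.

```python
import heapq

def cspf(ted, source, dest, required_bw):
    """Shortest path by TE metric, using only links with enough unreserved bandwidth."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dest:
            break
        if cost > dist[node]:
            continue  # stale heap entry
        for nbr, (metric, unreserved_bw) in ted.get(node, {}).items():
            if unreserved_bw < required_bw:
                continue  # constraint check: prune links that cannot carry the LSP
            new_cost = cost + metric
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                parent[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if dest not in dist:
        return None  # no path satisfies the constraint
    path, node = [], dest
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

# Hypothetical TED: unidirectional links as {node: {neighbor: (te_metric, unreserved_mbps)}}.
ted = {
    "A": {"B": (10, 100), "C": (10, 1000)},
    "B": {"D": (10, 1000)},
    "C": {"D": (20, 1000)},
    "D": {},
}

print(cspf(ted, "A", "D", 50))    # the cheapest path has enough bandwidth
print(cspf(ted, "A", "D", 500))   # link A->B is pruned, so the LSP goes around it
print(cspf(ted, "A", "D", 5000))  # nothing qualifies
```

The bandwidth values come from the TED, which is exactly the per-link information that only a link state protocol floods to every LSR.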