Orhan Ergun

We discussed LAG (Link Aggregation Group) and ECMP (Equal Cost Multipath) in real network deployments with the Service Provider/Telco engineers in my Slack group.

 

I thought it was a good discussion, so I am sharing it so that you can see what others are doing and the reasons behind their deployments.

 

Three people were involved in this talk: myself, Engineer 1 and Engineer 2. Since the participants may not want to expose their networks publicly and I didn’t ask their permission, I used Engineer 1 and Engineer 2 instead of their names.

 

Engineer 1 started the discussion. It is a nice one, and I hope you find it informative. Enjoy.

 

 

Engineer 1 

Hello guys, I hope you are all good.

 

I wanted to spark a little discussion about using LAGs vs ECMP in your networks. What are you using and what was the driver? Personally I use bundled interfaces.

 

It’s a bit easier to manage, fewer IP addresses are used, the IGP LSDB is smaller, and if a member link goes out of service it’s hidden from the IGP view.

 

Load balancing depends on the platform, but I think it’s as good as with ECMP. Obviously, with bundles we are restricted to the same pair of devices.

 

The downside is that it’s very hard to troubleshoot a malfunctioning LAG member link, because traceroute always looks the same. Also, if we need BFD support it may require micro-BFD, which is not always available. With ECMP links we don’t have that problem. Any other gotchas?

Oh, and QoS is sometimes a pain on bundles.
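
Engineer 1's point that load balancing is comparable on LAG and ECMP can be illustrated with a minimal sketch of per-flow hashing, which is the general approach in both cases. The hash function (CRC32), the 5-tuple fields, the link names and the flows below are all assumptions for illustration; real platforms use their own hash inputs and algorithms.

```python
import zlib
from collections import Counter


def pick_member(flow, members):
    """Map a flow tuple to one LAG member link (or one ECMP next hop).

    The same flow always hashes to the same member, which keeps packets
    of a flow in order. CRC32 is only a stand-in for a platform hash.
    """
    key = "|".join(str(field) for field in flow).encode()
    return members[zlib.crc32(key) % len(members)]


# Hypothetical member links of one bundle (or parallel ECMP links).
members = ["link-1", "link-2", "link-3", "link-4"]

# Hypothetical flows: (src IP, dst IP, protocol, src port, dst port).
flows = [("10.0.0.1", "10.0.1.1", 6, 1024 + i, 443) for i in range(1000)]

# Count how many flows land on each member link.
usage = Counter(pick_member(flow, members) for flow in flows)
for link in members:
    print(link, usage[link])
```

Because the mapping is per flow rather than per packet, how evenly the links are used depends on the mix of flows and on the hash, which is one reason the result varies by platform.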

 

 

Engineer 2 

 

I prefer ECMP rather than LAG anywhere possible

 

BFD and link issues are easier to handle with individual links

 

And for the LAG, it may cause congestion when one member link is down

 

For us, the IGP cost is fixed (manually) on the link. When a member link goes down, the cost remains the same

 

Hence the LAG link is more prone to congestion.
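
The congestion risk Engineer 2 describes can be shown with simple arithmetic. This is a minimal sketch; the 4 x 10G bundle and the 32G offered load are hypothetical numbers, not taken from the discussion. Because the IGP cost is fixed, the offered load is assumed to stay the same after a member fails.

```python
def lag_utilization(member_gbps, members_up, offered_gbps):
    """Return offered load divided by the capacity of the members still up."""
    capacity_gbps = member_gbps * members_up
    return offered_gbps / capacity_gbps


# Hypothetical bundle: 4 x 10G carrying 32G of traffic.
before = lag_utilization(member_gbps=10, members_up=4, offered_gbps=32)

# One member fails; the IGP cost is fixed, so the same 32G still arrives.
after = lag_utilization(member_gbps=10, members_up=3, offered_gbps=32)

print(f"before failure: {before:.0%}")
print(f"after failure:  {after:.0%}")
```

Any value above 100% after the failure means the remaining members cannot carry the traffic that the unchanged IGP cost keeps steering onto the bundle.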

 

 

Engineer 1  

Well, with LAG you can rely on min-links, so if too many member links go down, the entire LAG goes out of service

 

 

Engineer 2 

Yes. But if there are only 2 x 40G in the LAG, we would have a big problem.
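
The min-links trade-off in this exchange can be sketched as follows. The function name `lag_state` and its return format are hypothetical; the 2 x 40G figures come from Engineer 2's example.

```python
def lag_state(members_up, member_gbps, min_links):
    """Return (state, usable Gbps) for a bundle with a min-links threshold.

    If fewer than min_links members are up, the whole bundle is taken
    out of service so that routing can move traffic elsewhere.
    """
    if members_up < min_links:
        return ("down", 0)
    return ("up", members_up * member_gbps)


# 2 x 40G bundle with one member failed.
print(lag_state(members_up=1, member_gbps=40, min_links=2))  # bundle removed
print(lag_state(members_up=1, member_gbps=40, min_links=1))  # bundle stays at 40G
```

With only two members, a single failure either removes the whole bundle (min-links of 2) or leaves it running at half capacity (min-links of 1), which is the problem Engineer 2 points out.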

 

 

Orhan Ergun

 

It depends on the place in the network as well. In the datacenter, mostly LAG and MLAG are used. Even without BFD, failure detection is much faster there due to physical link detection. And if you run a routing protocol, which is the case in any large-scale DC nowadays, LAG reduces the number of adjacencies, as you stated above. On the WAN, though, mostly ECMP is used.

 

Those who have large WAN networks may not always have ECMP available. In that case neither LAG nor ECMP is available, but everyone uses traffic engineering.

 

MLAG might be a solution if the links are terminated on different devices but, again, almost no one uses it on the WAN. I see MLAG on the DCI (Datacenter Interconnect) and in campus designs.

 

Also, business types might dictate certain topologies. For example, retail enterprises mostly have a hub-and-spoke hierarchy. In that case, when there are two links from the spoke site toward the hub location, which is generally the HQ or datacenter, most people use ECMP regardless of the underlying service, be it MPLS VPN, Internet, P2P Ethernet and so on.

 

Also, since Engineer 1 is asking about your deployment model and its requirements and constraints, please share what you have and why you have it. When you have a question of your own, you can then expect the others to do the same.

 

 

Engineer 2 

We used to have lots of LAGs and frankly, they work fine.

 

Occasionally there is an issue, like the one I shared above, that leads to congestion

 

We counter that by provisioning sufficient bandwidth

 

We used LAG mainly because it is so easy to use. Just make sure the link quality is good, add it to the LAG, and it is done.

 

 

Orhan Ergun 

You use it on the WAN?

 

 

Engineer 2 

Yes, @orhanergun

 

In SG context, WAN is around 70km

 

 

Orhan Ergun

That’s okay, but what is the topology? I mean the layer 3 topology, not the optical one.

 

 

Engineer 2 

The core is a partial mesh, while each PE has dual links to two different Ps

 

 

Orhan Ergun 

How does LAG work between a PE and two different Ps? Do you have MLAG (Multi-Chassis Link Aggregation Group)?

 

 

Engineer 2 

Oh no. The LAG is between a PE and a P. Both PEs have two LAGs, to two Ps, and use ECMP if the destination has equal cost

 

Engineer 2 

We run BFD for fast detection and failover

 

It is that simple 🙂
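
How fast BFD detects a failure depends on the negotiated timers. As a minimal sketch, in BFD asynchronous mode (RFC 5880) the local detection time is the remote detect multiplier times the negotiated receive interval. The timer values below are hypothetical and are not the ones used in Engineer 2's network.

```python
def detection_time_ms(local_required_min_rx_ms,
                      remote_desired_min_tx_ms,
                      remote_detect_mult):
    """Local BFD detection time in asynchronous mode, in milliseconds.

    The remote peer transmits at the slower (larger) of its desired
    transmit interval and our required receive interval; the session is
    declared down after remote_detect_mult such intervals pass with no
    packet received.
    """
    negotiated_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * negotiated_ms


# Hypothetical timers: both sides at 50 ms, multiplier 3.
print(detection_time_ms(50, 50, 3))   # 150

# Hypothetical mismatch: the peer can only transmit every 100 ms.
print(detection_time_ms(50, 100, 3))  # 300
```

With micro-BFD (RFC 7130), one such session runs on every member link of the bundle, so a single failed member can be detected and removed individually.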

 

Orhan Ergun

Do you have micro-BFD on the LAG bundles?

 

 

Engineer 2

Yes.

 

 

Orhan Ergun

OK, I won’t ask about the devices

 

 

Engineer 2 

Interestingly, we use two vendors and the BFD works flawlessly

 

 

Orhan Ergun

Is interoperability fine?

 

 

Engineer 2 

Yes.

 

Orhan Ergun

Couldn’t you do fast failure detection at layer 0 or layer 1?

 

 

Engineer 2 

In some cases, we have DWDM in between. So L1 detection isn’t enough

 

Orhan Ergun

Ok

 

Do you have SDH too?

 

Engineer 2 

No, we do not use SDH

 

SDH serves as our access/last mile

 

 

Orhan Ergun

Still, you might need fast failure detection at your last mile for business customers

 

Are you using APS in that case?

 

 

Engineer 2 

Oh, I don’t have access to that part of the network

 

But as far as I know, for SDH, they can provide 50ms for switching/protection.

 

 

Orhan Ergun

Then they have it

 

 

Engineer 2 

 

We also have a metro-e network for access

 

And they use TE FRR if I am not wrong

 

 

Orhan Ergun

Are you using G.8032, REP, ERP, etc.?

 

 

Engineer 2 

But it is not part of my network

 

 

Orhan Ergun

Any fast failure mechanism for ME?

 

You mean you have TE FRR in the access network?

 

 

Engineer 2 

That’s what I understand

 

 

— One Comment —

  1. Nice discussion! I fully agree with all the pros and cons mentioned for LAG vs ECMP.
    In addition, I would say that QoS is a little different on LAG interfaces compared to normal interfaces. For example, on Cisco ASR1K devices, to have proper QoS on a port-channel you need to prepare the platform in advance, before creating the port-channel (platform qos….). Things get worse if you need QoS per LAG sub-interface (still technically possible, but much more complicated).
