This question comes not only from my students but also from the companies I provide consultancy to.
I will not go through the OTV details here: how it works, design recommendations and so on. But let me very briefly remind you what OTV is, why it is used and where it makes sense.
OTV (Overlay Transport Virtualization) is a tunnelling mechanism that carries Layer 2 Ethernet frames in IP. (As I indicated in other articles, when I say MAC in IP, it means the same thing as MAC over IP.)
So, OTV is a Layer 2 in Layer 3 tunnelling mechanism. You may also hear it called an encapsulation mechanism, which is true, although there is a small difference.
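To make "MAC in IP" a bit more concrete, here is a minimal Python sketch. It is only an illustration of the idea, not the real OTV packet format: the class names, fields and IP addresses are all made up for the example. The point is simply that the whole Layer 2 frame, MAC addresses included, travels as the payload of an IP packet between two edge devices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EthernetFrame:
    """Hypothetical, simplified Layer 2 frame."""
    dst_mac: str
    src_mac: str
    payload: bytes


@dataclass(frozen=True)
class IpPacket:
    """Hypothetical, simplified IP packet whose payload is a full L2 frame."""
    src_ip: str
    dst_ip: str
    payload: EthernetFrame


def encapsulate(frame: EthernetFrame, local_edge_ip: str, remote_edge_ip: str) -> IpPacket:
    # The original frame is carried untouched; only an outer IP header is added.
    return IpPacket(src_ip=local_edge_ip, dst_ip=remote_edge_ip, payload=frame)


def decapsulate(packet: IpPacket) -> EthernetFrame:
    # The remote edge device strips the outer header and recovers the frame.
    return packet.payload


frame = EthernetFrame("aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01", b"hello")
packet = encapsulate(frame, "192.0.2.1", "192.0.2.2")
recovered = decapsulate(packet)
```

The IP network in the middle only looks at the outer addresses, which is why no MPLS or Layer 2 transport is needed between the sites.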
You don't need to have an MPLS underlay to create OTV tunnels. OTV uses IS-IS to carry MAC address reachability, and it stops Layer 2 protocol PDUs at the OTV edge device, where the encapsulation happens.
This is good because you don't want to extend Layer 2 protocol PDUs such as Spanning Tree between your datacenters if you have more than one. A failure then stays in, and affects, only one datacenter, not all of them. (This is the failure domain boundary concept.)
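The two behaviours above, MAC reachability learned through a control plane and Layer 2 PDUs stopped at the edge, can be sketched in a few lines of Python. This is a toy model under my own assumptions, not Cisco's implementation: the `EdgeDevice` class and its methods are hypothetical, a plain dictionary stands in for the IS-IS advertisements, and the sketch chooses not to send unknown destinations over the overlay. The only real-world constant is the Spanning Tree BPDU destination MAC address.

```python
# Destination MAC used by IEEE 802.1D Spanning Tree BPDUs.
STP_BPDU_DST = "01:80:c2:00:00:00"


class EdgeDevice:
    """Hypothetical edge device of one datacenter (toy model)."""

    def __init__(self, site_ip: str):
        self.site_ip = site_ip
        self.local_macs = set()   # MACs seen inside this datacenter
        self.remote_macs = {}     # MAC -> IP of the remote edge device

    def learn_local(self, mac: str) -> None:
        self.local_macs.add(mac)

    def advertise(self) -> dict:
        # Stand-in for the control plane: "these MACs are reachable via me".
        return {mac: self.site_ip for mac in self.local_macs}

    def receive_advertisement(self, routes: dict) -> None:
        self.remote_macs.update(routes)

    def forward(self, dst_mac: str) -> str:
        if dst_mac == STP_BPDU_DST:
            return "drop"  # Layer 2 protocol PDUs never cross the overlay
        if dst_mac in self.local_macs:
            return "local"
        if dst_mac in self.remote_macs:
            return f"tunnel to {self.remote_macs[dst_mac]}"
        return "drop"      # assumption of this sketch: no flooding over the overlay


dc1 = EdgeDevice("192.0.2.1")
dc2 = EdgeDevice("192.0.2.2")
dc1.learn_local("aa:aa:aa:aa:aa:01")
dc2.receive_advertisement(dc1.advertise())
```

Here `dc2` knows where the host in the first datacenter lives without ever having seen a frame from it, and a BPDU arriving at either edge device goes no further.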
Another datacenter interconnect requirement is ARP proxy. It is not mandatory, but it is good if your tunnelling mechanism (I should probably say Layer 2 extension mechanism) provides a way to reply to ARP requests locally. OTV provides this functionality as well.
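As a rough illustration of what replying to ARP locally means, here is a small Python sketch. The `ArpProxy` class is hypothetical, and how the cache gets populated is my assumption for the example (here, by remembering replies that passed through the edge device earlier); I am not describing OTV internals.

```python
class ArpProxy:
    """Hypothetical ARP proxy running on an edge device (toy model)."""

    def __init__(self):
        self.cache = {}  # IP address -> MAC address of hosts in remote sites

    def snoop_reply(self, ip: str, mac: str) -> None:
        # Assumption: the edge device remembers ARP replies it has seen.
        self.cache[ip] = mac

    def handle_request(self, target_ip: str):
        mac = self.cache.get(target_ip)
        if mac is not None:
            # Answer on behalf of the remote host; nothing crosses the interconnect.
            return ("reply_locally", mac)
        # First request for this host: it still has to go to the other sites.
        return ("forward_over_overlay", None)


proxy = ArpProxy()
first = proxy.handle_request("10.1.1.20")
proxy.snoop_reply("10.1.1.20", "aa:aa:aa:aa:aa:20")
second = proxy.handle_request("10.1.1.20")
```

Only the first request for a host has to cross the interconnect; later ones are answered at the edge, which keeps broadcast traffic off the links between the datacenters.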
Should I use Cisco OTV for the Datacenter Interconnect?
There are some problems, and I will highlight the two most obvious ones. One of them in particular might stop many networks from using OTV.
Since MAC reachability information is carried through IS-IS, you can run into scalability problems when a large number of MAC addresses has to be carried. BGP, by comparison, can scale up to millions of MAC or IP prefixes. (EVPN, PBB-EVPN)
Also, I think there is an implementation limit for OTV: only up to a certain number of locations can participate in the overlay. But here we should be fair and say that although VPLS doesn't have this limitation, it doesn't make sense to use VPLS to interconnect 10 or more datacenters because of data plane learning, and I guess we all know the problems of data plane learning.
By the way, 10 is not a calculated number at all; I just used it as an example. If the number of MAC addresses per datacenter is higher, interconnecting even fewer datacenters can probably cause a problem.
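A quick back-of-the-envelope calculation shows why the number of datacenters alone does not tell you much. The figures below are invented purely for illustration, and the function names are mine; the sketch just multiplies sites by MAC addresses per site.

```python
def total_advertised_macs(sites: int, macs_per_site: int) -> int:
    # Every site advertises its own MACs into the overlay control plane.
    return sites * macs_per_site


def remote_macs_per_edge(sites: int, macs_per_site: int) -> int:
    # Each edge device has to hold the MACs of all the other sites.
    return (sites - 1) * macs_per_site


# Hypothetical numbers, not measurements or vendor limits.
many_small_sites = remote_macs_per_edge(sites=10, macs_per_site=5_000)
few_large_sites = remote_macs_per_edge(sites=4, macs_per_site=20_000)
```

With these made-up numbers, four large datacenters put more remote MAC entries on each edge device (60,000) than ten small ones do (45,000).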
So OTV with IS-IS can be a problem when there is a large number of MAC addresses per datacenter, but this is definitely not the case for many networks today. If we are talking about Massive Scale Datacenters, they don't use Layer 2 extension, and in fact they don't use Layer 2 protocols inside the datacenter either. (They use BGP, specifically EBGP, and there are many reasons for it; let's talk about them in a separate article.)
Another problem with OTV, of course, is that it is Cisco proprietary. If you want to use devices from a different vendor at the DC edge of your network, you cannot. OTV is not interoperable with other overlay technologies. You cannot use OTV together with VPLS, for example. I intentionally compared VPLS with OTV throughout the post, because VPLS is one of the most commonly used datacenter interconnect mechanisms in today's networks.
Again, let's think about real networks. Do you really care about using multiple vendors? Or if Cisco gives you a better price, better support, good documentation and, most importantly, the best sales engineers ( ?? ), do you still try to avoid vendor lock-in? Or do you think the decision is taken for political reasons in your company? Let me hear your thoughts in the comment section below. To get a great understanding of SP Networks, you can check my newly published Service Provider Networks Design and Perspective Book.