What we can learn from data centres on redundancy

Submitted by fredrik.nyman on Tue, 02/05/2019 - 09:23

Last week I wrote about redundancy in FTTH networks and the fact that a layer 3 architecture makes redundant topologies easier to implement and operate. One area of modern networking that has recently embraced layer 3 is the data center.

The past few years have seen a lot of development in the data center world. New companies have emerged and new technologies have changed the way network equipment and architectures in data centers are built. I'm thinking about technology such as SDN with whitebox switches and control plane separation. OpenFlow is one such protocol; for a few years it gained some momentum thanks to its fine-grained control of traffic.

In data centers with thousands of virtual machines the network architecture quickly becomes complex. The virtual machines need to communicate in private or semi-private networks, and at the same time a redundant network architecture is needed to provide high availability. The old VLAN and spanning-tree protocols do not scale well enough to handle the criss-crossing of connectivity between different virtual machines and even between different data centres. In the really big data centres we are talking about tens of thousands of virtual machines, so VLAN and MAC-address scalability quickly becomes an issue with a layer 2 topology.

In response a protocol called VXLAN emerged to offer layer 2 connectivity over layer 3.

At first glance VXLAN is just another tunnelling protocol. We have seen this before. L2TP was, and still is, used for this particular purpose. GRE in some implementations also supports a layer 2 payload. A variant of GRE (NVGRE) was also on the table, but VXLAN seems to have emerged the victor. The IETF is currently working on a standardized version (Geneve). The concept is called layer 2 overlay - a layer 2 network over a layer 3 topology.

As networks grow in scale they tend to go down the MPLS route to create and separate private networks over the same infrastructure. So why one more protocol? What was wrong with the multiple options that already existed?

Well, VXLAN has a couple of key benefits. First, it separates control plane and forwarding plane in a way where the forwarding plane can easily be implemented in hardware ASICs. L2TP has a lot of signalling, which makes hardware acceleration difficult, or at least ties it much more closely to the control plane. With VXLAN the packet format is simple, so the lower-cost ASICs used in data centre switches can implement the technology in hardware, which improves performance.
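To give a feel for how simple the packet format is, here is a minimal Python sketch of the 8-byte VXLAN header as I understand it from RFC 7348: one flags byte, 24 reserved bits, a 24-bit network identifier (VNI) and 8 more reserved bits. The function names are my own, chosen for illustration. The 24-bit VNI is also what lifts the segment limit from the roughly 4,000 IDs of a 12-bit VLAN tag to about 16 million.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field carries a valid value


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags (8 bits) followed by 24 reserved bits.
    # Second 32-bit word: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


if __name__ == "__main__":
    header = pack_vxlan_header(5000)
    print(header.hex(), unpack_vni(header))
```

Two fixed-size words with no per-tunnel state to consult is exactly the kind of thing an ASIC can prepend or strip at line rate.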

Secondly, the VXLAN tunnels are stateless. In MPLS you need to configure every end-point to make it part of a private connection. This requires manual hands-on work or advanced automation systems to handle reconfiguration of equipment. It's also a bit complex to mix and match end-point configuration in MPLS.

In VXLAN, the forwarding is based on Ethernet-like tables. A MAC-address+VLAN is associated with a VXLAN end-point. So when a frame arrives the switch looks up in its forwarding table where the frame should be bridged. If the destination is over a VXLAN tunnel the switch creates the necessary header dynamically and sends the packet. No need to pre-configure the tunnel end-points - the end-points need not be aware of each other's existence until there is traffic to pass. This reduces the strain on configuration and control.
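The lookup described above can be sketched in a few lines of Python. This is an illustration under my own assumptions, not how any particular switch implements it: the `ForwardingTable` class, the `vtep:` prefix for remote end-points and the "flood" fallback for unknown destinations are all hypothetical names and simplifications.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Key = Tuple[str, int]  # (destination MAC address, VLAN ID)


@dataclass
class ForwardingTable:
    """Ethernet-like table: (MAC, VLAN) -> local port or remote VXLAN end-point."""
    entries: Dict[Key, str] = field(default_factory=dict)

    def learn(self, mac: str, vlan: int, location: str) -> None:
        # location is either a local port name or "vtep:<ip>" (hypothetical format)
        self.entries[(mac.lower(), vlan)] = location

    def lookup(self, mac: str, vlan: int) -> Optional[str]:
        return self.entries.get((mac.lower(), vlan))


def forward(table: ForwardingTable, dst_mac: str, vlan: int) -> str:
    """Decide what to do with a frame; returns a description of the action."""
    location = table.lookup(dst_mac, vlan)
    if location is None:
        # Simplification: unknown destinations are flooded within the segment.
        return "flood"
    if location.startswith("vtep:"):
        # The tunnel header is created on the fly; nothing was pre-configured.
        return f"encapsulate to {location[len('vtep:'):]}"
    return f"bridge out {location}"


if __name__ == "__main__":
    table = ForwardingTable()
    table.learn("00:11:22:33:44:55", 100, "port1")
    table.learn("AA:BB:CC:DD:EE:FF", 100, "vtep:10.0.0.2")
    print(forward(table, "aa:bb:cc:dd:ee:ff", 100))
```

Note that the remote end-point only appears as a value in the table. There is no tunnel object to create, monitor or tear down.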

Different means, including EVPN, have emerged as options to signal the control-plane information (MAC+VLAN forwarding table information), which also means that VXLAN can integrate nicely with BGP for signalling. But other options also exist, as for some topologies even BGP becomes overcomplicated.

To solve those two problems of VLAN/MAC scalability and redundancy in the data centre leaf-spine, VXLAN technology allows the whole data centre to operate on layer 3 with well proven redundancy and load-balancing solutions. Routing also means that no MAC addresses need to be kept in the spine for the thousands of virtual machines connected in the network. This reduces load and simplifies the spine. Routing also means that cross-connections between data centres can be routed and still allow layer 2 connectivity between virtual machines in different data centres, thereby supporting scaling.
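One detail that makes the layer 3 load-balancing work well with VXLAN: RFC 7348 recommends that the sending end-point derives the outer UDP source port from a hash of the inner frame, so that routers doing equal-cost multipath (ECMP) spread different flows over different spines. The Python sketch below illustrates the idea only. The CRC32 hash, the fields hashed and the spine names are my assumptions; real switches use their own hardware hash functions.

```python
import zlib
from typing import Sequence

EPHEMERAL_BASE = 49152   # start of the dynamic port range recommended by RFC 7348
EPHEMERAL_SIZE = 16384   # 49152..65535


def vxlan_source_port(inner_src_mac: str, inner_dst_mac: str) -> int:
    """Derive the outer UDP source port from inner-frame fields (illustrative hash)."""
    digest = zlib.crc32(f"{inner_src_mac}|{inner_dst_mac}".lower().encode())
    return EPHEMERAL_BASE + digest % EPHEMERAL_SIZE


def pick_spine(spines: Sequence[str], outer_src_ip: str,
               outer_dst_ip: str, src_port: int) -> str:
    """Choose one equal-cost next hop by hashing the outer flow identifiers."""
    key = f"{outer_src_ip}|{outer_dst_ip}|{src_port}".encode()
    return spines[zlib.crc32(key) % len(spines)]


if __name__ == "__main__":
    spines = ["spine1", "spine2", "spine3", "spine4"]  # hypothetical names
    port = vxlan_source_port("00:11:22:33:44:55", "aa:bb:cc:dd:ee:ff")
    print(port, pick_spine(spines, "10.0.0.1", "10.0.0.2", port))
```

The same flow always hashes to the same spine, which keeps packets in order, while different flows between the same two end-points can take different paths.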

So the whole data centre infrastructure lies on a routed network with scalable and easy operation through routing protocols. Still, full layer 2 connectivity is preserved with hardware accelerated performance through the VXLAN protocol. The cost is a little more overhead on the links, but that's a bargain for the operational benefits in the form of flexibility, redundancy and scalability.
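To put a number on "a little more overhead", here is a back-of-the-envelope calculation in Python. It assumes an IPv4 underlay with no outer VLAN tag and ignores the frame check sequence; with an IPv6 underlay or an outer VLAN tag the figure is somewhat higher.

```python
# Bytes added in front of each original (inner) Ethernet frame.
OUTER_ETHERNET = 14   # outer MAC header, untagged
OUTER_IPV4 = 20       # outer IPv4 header without options
OUTER_UDP = 8         # outer UDP header
VXLAN_HEADER = 8      # VXLAN header

OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def overhead_fraction(inner_frame_bytes: int) -> float:
    """Share of the bytes on the wire that is encapsulation overhead."""
    return OVERHEAD / (inner_frame_bytes + OVERHEAD)


if __name__ == "__main__":
    for size in (64, 512, 1514):
        print(f"{size:5d}-byte frame: {overhead_fraction(size):.1%} overhead")
```

That is 50 bytes per frame, or roughly 3% for full-size 1514-byte frames. The share is much larger for small frames, and the underlay MTU has to be raised by the same 50 bytes if full-size inner frames are to pass without fragmentation.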

So what can FTTH learn from data centres? Well, why not use VXLAN for bitstream/wholesale services in fibre networks? Why not use VXLAN instead of customer VLANs to connect each and every customer to the central BNG? Why not build a stable, reliable, easy to operate layer 3 routed infrastructure in the access and still provide the layer 2 connectivity and services needed to enable the full service portfolio?

VXLAN applied in fibre networks has good potential to provide the same operational benefits to FTTH as it has to data centres.

Blog posts

How do you troubleshoot IoT devices?

Submitted by fredrik.nyman on Fri, 02/15/2019 - 13:00

Continuing on the subject of troubleshooting the network. Troubleshooting MPEG video has the benefit of a user who can tell you if it doesn't work, and you can simply ask that user if the problem persists once you have fixed it. But what if there isn't any obvious way to determine if things are working? For example, is that trashcan really signalling that it's full, or does the temperature device really update the building climate control properly?

How to see what your users see

Submitted by fredrik.nyman on Mon, 02/11/2019 - 10:21

Live broadcast TV is one of the most popular services in fibre networks. You can get high quality pictures because there is enough bandwidth to send video uncompressed. But the nature of broadcast media is that it is very sensitive to packet loss or jitter. There is no retransmission of packets because it is live – you can’t hold the stream to get a lost packet back.

FTTH is not like any other network

Submitted by fredrik.nyman on Fri, 01/25/2019 - 13:34

If you are working in network engineering, hands-on with the routers and switches in the network, you probably have seen your fair share of network problems. However well you build it there is always some intermittent issue, some complaining user, some application that doesn’t get the throughput, some website that is unreachable.

It’s part of the everyday chaos of running a network to deal with big and small issues.

The Way Better Blog

Submitted by fredrik.nyman on Fri, 01/25/2019 - 10:02

In this blog I will be writing about some of the topics, big and small, facing network engineers and fibre networks and the kind of challenges I have encountered working with our customers over the past 20 years or so.