There are two recurring themes I’ve heard recently around why SDN and OpenFlow don’t make sense. I’m going to pick on Ivan at ipSpace.net, but that’s just because he’s put the arguments out there in the most digestible form I’ve seen. There are lots of other places I’ve heard the same thing.
The two themes are:
- Centralization doesn’t scale and won’t work in real networks. (See this.)
- OpenFlow is a poor fit for use case X. (See this about overlay networks.)
Honestly, Rob Sherwood does a great job of deflecting both of these when he discusses how modern OpenFlow fully exploits hardware and how modern SDN controllers can provide better scale-out and fault-tolerance than traditional networking.
However, here, I’m going to talk about my take on them and how both of these mistaken assumptions are actually symptoms of a broader problem in how we think about networking—namely how we fail to build clean layers in network solutions.
Logical Centralization in SDN
As everyone loves to point out, the most common definition of SDN is a logically centralized control plane that is separate from the data plane and open protocols to govern the interaction between the two.
I’d like to call attention to the word “logically” in that statement. It’s both where building SDN control planes gets tricky and where the claim that centralization doesn’t scale loses its validity.
As Greg Ferro points out:
OSPF is a distributed network controller. It does the configuration of the forwarding table from the control plane. Your welcome.
Grammar mistakes aside, he has a point. Almost all control plane protocols (OSPF included) try to provide some degree of logical centralization while actually being distributed so that they are fault-tolerant and can scale.
The key difference between legacy protocols and SDN controllers is that most legacy protocols pick one model of where to draw the line between centralized and distributed, and that choice is baked into the hardware. With SDN controllers, you can choose any point you want, from a single central controller all the way to an instance of the controller running for every device in the network.
Further, you can make that decision differently for different parts of your network. For example, you can have fully centralized traffic engineering rerouting elephant flows while basic routing stays fully decentralized. This is more or less what Google does in their B4 WAN network.
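To make that hybrid concrete, here is a minimal sketch of mixing centralization points per function: shortest-path routing stays distributed per device, while a logically centralized traffic engineer overrides only elephant flows. All class and variable names here are hypothetical, invented for illustration; this is not any real controller’s API and glosses over how B4 actually works.

```python
# Hypothetical sketch: distributed routing plus centralized TE overrides.

ELEPHANT_BYTES = 10 * 1024 * 1024  # threshold above which a flow is "elephant"

class DistributedRouter:
    """Per-device routing state, computed locally (the OSPF-like part)."""
    def __init__(self, device, next_hops):
        self.device = device
        self.next_hops = next_hops  # dst prefix -> locally computed next hop

    def route(self, dst):
        return self.next_hops.get(dst, "default")

class CentralTrafficEngineer:
    """Logically centralized TE that only touches elephant flows."""
    def __init__(self):
        self.overrides = {}  # (device, dst) -> engineered next hop

    def observe(self, device, dst, byte_count, alt_hop):
        # Only big flows earn a centrally computed override.
        if byte_count >= ELEPHANT_BYTES:
            self.overrides[(device, dst)] = alt_hop

    def route(self, device, dst):
        return self.overrides.get((device, dst))

def forward(device_router, te, dst):
    # A TE override wins if present; otherwise fall back to distributed routing.
    return te.route(device_router.device, dst) or device_router.route(dst)

r1 = DistributedRouter("r1", {"10.0.0.0/8": "r2"})
te = CentralTrafficEngineer()
print(forward(r1, te, "10.0.0.0/8"))  # r2: the distributed path
te.observe("r1", "10.0.0.0/8", 50 * 1024 * 1024, "r3")
print(forward(r1, te, "10.0.0.0/8"))  # r3: the engineered path
```

The point of the sketch is that the split is a policy knob, not a property of the hardware: moving the elephant threshold, or the set of functions the central piece owns, moves you along the centralized-distributed spectrum.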
I gave a talk about how we should think about this—without too many presupposed solutions—at the first OpenDaylight summit. The talk video and slides (with good references at the end) are both public.
OpenFlow is a Bad Fit for Task X
There are two parts to this. First, OpenFlow doesn’t actually fit networking hardware. Second, even where it does, it’s a kludge to implement higher-level features with it.
OpenFlow Doesn’t Work on Real Hardware
The first one is a pet peeve of mine because I helped the ONF’s Forwarding Abstractions Working Group (FAWG) figure out how to make OpenFlow 1.3 and later much more hardware-friendly. We wound up defining something called Table Type Patterns (TTPs). TTPs were also the inspiration for Broadcom’s OF-DPA, which provides a 10-table OpenFlow 1.3 abstraction for their modern switching ASICs.
I should give all the credit for TTPs to Curt Beckmann and the rest of the FAWG, as I got distracted by OpenDaylight pretty early on and have only recently gotten re-involved as part of an OpenDaylight project to support them.
Curt Beckmann and I gave a talk about some of this at the OpenDaylight summit—there’s a public video and slides.
Joe Tardo and others gave a talk about the Broadcom OF-DPA at the 2014 Open Networking Summit, which you can find in the video archives by finding the Developer Track talk counter-intuitively titled “Floodlight: Open Network Hardware Programming”.
Long story short, OpenFlow 1.0 was hard to make work on real hardware. OpenFlow 1.3 can be mapped to real hardware just fine, but it takes some effort to define the right set of tables. The Forwarding Abstractions Working Group’s TTPs and Broadcom’s OF-DPA show how to get this done.
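To give a flavor of what “defining the right set of tables” means, here is an illustrative sketch of a TTP-style pipeline description: a fixed multi-table layout a switch could advertise, plus a check that a proposed flow entry fits it. The table names loosely echo OF-DPA’s pipeline, but the data structure is invented for illustration; it is not the actual TTP JSON schema or the real OF-DPA table set.

```python
# Hypothetical TTP-style pipeline: which fields each table can match,
# and which goto-table transitions the hardware allows.
PIPELINE = {
    0: {"name": "vlan",        "match": {"in_port", "vlan_vid"}, "goto": {1}},
    1: {"name": "term_mac",    "match": {"eth_dst", "eth_type"}, "goto": {2, 3}},
    2: {"name": "l2_bridging", "match": {"vlan_vid", "eth_dst"}, "goto": set()},
    3: {"name": "l3_unicast",  "match": {"ipv4_dst"},            "goto": set()},
}

def fits_pipeline(table_id, match_fields, goto_table=None):
    """Would this flow entry be accepted by hardware exposing PIPELINE?"""
    table = PIPELINE.get(table_id)
    if table is None:
        return False  # no such table in this pipeline
    if not set(match_fields) <= table["match"]:
        return False  # this hardware table can't match on those fields
    if goto_table is not None and goto_table not in table["goto"]:
        return False  # illegal table transition for this pipeline
    return True

print(fits_pipeline(3, ["ipv4_dst"]))  # an L3 route fits the l3_unicast table
print(fits_pipeline(0, ["ipv4_dst"]))  # the vlan table can't match on IP
```

With a shared description like this, a controller can check up front whether its flows will map to the hardware, instead of discovering at runtime that a single-table OpenFlow 1.0 view doesn’t exist in the ASIC.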
OpenFlow is a Bad Fit to Implement X
Ivan points out that there are simpler things than OpenFlow that might provide a way to build overlay virtual networks. This will always be the case. For example, x86 assembly language seems like a really crappy abstraction to start with for video playback, but it’s where we started, and then we provided higher and higher levels of abstraction until we could offer a function that was basically “play this file”.
Similarly, OpenFlow isn’t the most natural fit for high-level tasks, but that’s kind of the point. We need layers of abstraction that sit between high-level tasks and the low-level way they are implemented.
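The layering idea can be sketched in a few lines: a high-level request like “connect these two VMs on an overlay” compiled down into low-level OpenFlow-like match/action rules. Every name here (the function, the fields, the port name) is hypothetical and just illustrates the shape of such a layer, not any real overlay controller.

```python
# Hypothetical high-level API compiled into low-level match/action rules.

def connect_vms(vni, vm_a, vm_b):
    """High-level request: return the low-level rules that realize it."""
    rules = []
    for src, dst in ((vm_a, vm_b), (vm_b, vm_a)):
        rules.append({
            # Match traffic on this overlay segment headed for dst's MAC...
            "match": {"tunnel_id": vni, "eth_dst": dst["mac"]},
            # ...and tunnel it to the host where dst lives.
            "actions": [("set_tunnel_dst", dst["host"]), ("output", "vxlan0")],
        })
    return rules

vm1 = {"mac": "aa:aa:aa:aa:aa:01", "host": "10.0.0.1"}
vm2 = {"mac": "aa:aa:aa:aa:aa:02", "host": "10.0.0.2"}
rules = connect_vms(5001, vm1, vm2)
print(len(rules))  # one rule per direction
```

The caller never sees match fields or tunnels, yet the low-level interface underneath stays general-purpose: the same primitives can be recombined for service chaining, monitoring taps, or whatever the next use case turns out to be.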
Taking away general-purpose, low-level access, and thus reducing how we can reuse and remix underlying network functionality, is exactly what we’ve been getting wrong when we ship purpose-built hardware and software/firmware. Baking one trade-off point on the centralized-distributed spectrum into our hardware is just another example of this.
The Bigger Problem
The bigger problem we have in networking is that we can’t seem to figure out how to provide layering in our solutions. Instead, we pick particular full stacks across the layers, along with their associated design decisions (like trade-offs between centralization and distribution), and bake them into our solutions and hardware.
Instead, we need to start actually providing pluggable elements at each layer and making them as open as we possibly can. OpenFlow is one good interface between the control plane and a good swath of different (software and hardware) data planes, but it’s not the only one. Similarly, OpenDaylight is working on providing a pluggable control plane with the intention of letting people pick their own trade-offs in the centralized-distributed design space.