There have been two recurring themes I’ve heard recently around why SDN and OpenFlow don’t make sense. I’m going to pick on Ivan at ipSpace.net, but that’s just because he’s put the arguments out there in the most digestible form I’ve seen. There are lots of other places I’ve heard the same things.
The two themes are:
- Centralization doesn’t scale and won’t work in real networks. (See this.)
- OpenFlow is a poor fit for use case X. (See this about overlay networks.)
Honestly, Rob Sherwood does a great job of deflecting both of these when he discusses how modern OpenFlow fully exploits hardware and modern SDN controllers can provide better scale-out and fault-tolerance than traditional networking.
However, here I’m going to give my take on them and show how both of these mistaken assumptions are actually symptoms of a broader problem in how we think about networking: our failure to build clean layers in network solutions.
Logical Centralization in SDN
As everyone loves to point out, the most common definition of SDN is a logically centralized control plane that is separate from the data plane and open protocols to govern the interaction between the two.
I’d like to call attention to the word “logically” in that statement. It’s both where building SDN control planes gets tricky and where the claim that centralization doesn’t scale loses its validity.
As Greg Ferro points out:
OSPF is a distributed network controller. It does the configuration of the forwarding table from the control plane. Your welcome.
Grammar mistakes aside, he has a point. Almost all control plane protocols (OSPF included) try to provide some degree of logical centralization while actually being distributed so that they are fault-tolerant and can scale.
The key difference between legacy protocols and SDN controllers is that most legacy protocols pick one model of where to draw the line between centralized and distributed, and that choice is baked into the hardware. With SDN controllers, you can choose any point you want, from a single central controller all the way to an instance of the controller running on every device in the network.
Further, you can make that decision differently for different parts of your network. For example, you can have fully centralized traffic engineering rerouting elephant flows while basic routing stays fully decentralized. This is more or less what Google does in their B4 WAN network.
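To make that hybrid concrete, here’s a toy Python model of how OpenFlow-style priority matching lets both layers coexist in one flow table: the distributed routing layer installs a low-priority default entry, and a centralized TE application installs a higher-priority entry for one elephant flow. All the names and entries here are illustrative, not a real controller API; the one thing it does model faithfully is that an OpenFlow lookup selects the highest-priority matching entry.

```python
def lookup(flow_table, packet):
    """Return the highest-priority entry whose match fields all equal
    the corresponding packet fields (a simplification of OpenFlow
    matching -- no wildcard masks here)."""
    hits = [e for e in flow_table
            if all(packet.get(k) == v for k, v in e["match"].items())]
    return max(hits, key=lambda e: e["priority"]) if hits else None

flow_table = [
    # Installed by the distributed routing layer: default next hop.
    {"priority": 100, "match": {"dst_ip": "10.1.1.1"}, "out_port": 1},
    # Installed by the centralized TE app: reroute one heavy sender.
    {"priority": 200,
     "match": {"dst_ip": "10.1.1.1", "src_ip": "10.2.3.4"},
     "out_port": 2},
]

# Ordinary traffic follows the routed path (port 1); the elephant
# flow from 10.2.3.4 takes the TE-chosen path (port 2).
ordinary = lookup(flow_table, {"dst_ip": "10.1.1.1", "src_ip": "10.9.9.9"})
elephant = lookup(flow_table, {"dst_ip": "10.1.1.1", "src_ip": "10.2.3.4"})
```

The nice property is that neither layer needs to know about the other’s entries; priorities alone arbitrate between them.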
I gave a talk about how we should think about this—without too many presupposed solutions—at the first OpenDaylight summit. The talk video and slides (with good references at the end) are both public.
OpenFlow is a Bad Fit for Task X
There are two parts to this. First, OpenFlow doesn’t actually fit networking hardware. Second, even if it does, it’s a kludge to implement higher-level features with it.
OpenFlow Doesn’t Work on Real Hardware
The first one is a pet peeve of mine because I helped the ONF’s Forwarding Abstractions Working Group (FAWG) figure out how to make OpenFlow 1.3 and later much more hardware-friendly. We wound up defining something called Table Type Patterns (TTPs), which were also the inspiration for Broadcom’s OF-DPA, a ten-table OpenFlow 1.3 abstraction for their modern switching ASICs.
I should give all the credit for TTPs to Curt Beckmann and the rest of the FAWG, as I got distracted by OpenDaylight pretty early on and have only recently gotten re-involved as part of an OpenDaylight project to support them.
Curt Beckmann and I gave a talk about some of this at the OpenDaylight summit—there’s a public video and slides.
Joe Tardo and others gave a talk about the Broadcom OF-DPA at the 2014 Open Networking Summit, which you can find in the video archives by finding the Developer Track talk counter-intuitively titled “Floodlight: Open Network Hardware Programming”.
Long story short, OpenFlow 1.0 was hard to make work on real hardware. OpenFlow 1.3 can be mapped to real hardware just fine, but takes some effort to define the right set of tables. The Forwarding Abstractions Working Group’s TTPs and Broadcom’s OF-DPA show how to get this done.
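To illustrate why a multi-table pipeline maps to hardware better than OpenFlow 1.0’s single table, here’s a toy Python sketch of an OpenFlow 1.3-style pipeline where each table applies actions and then jumps to the next stage via goto-table. The three-stage layout and table names are made up for the example (OF-DPA’s real pipeline has ten tables); only the goto-table control flow reflects the actual OpenFlow 1.3 model.

```python
def run_pipeline(tables, packet, start="vlan"):
    """Walk an OpenFlow 1.3-style pipeline: each table returns actions
    to apply plus an optional next table; None ends the pipeline."""
    actions, table = [], start
    while table is not None:
        entry = tables[table](packet)     # each table is a lookup function
        actions += entry.get("apply", [])
        table = entry.get("goto")         # None => end of pipeline
    return actions

tables = {
    # Stage 1: VLAN handling, then on to L2 forwarding.
    "vlan": lambda p: {"apply": [("push_vlan", 10)], "goto": "mac"},
    # Stage 2: MAC lookup decides the output port, then a policy check.
    "mac":  lambda p: {"apply": [("set_out_port",
                                  2 if p["dst_mac"] == "aa:bb" else 1)],
                       "goto": "acl"},
    # Stage 3: an ACL can drop or let the earlier decision stand.
    "acl":  lambda p: {"apply": [("drop",)] if p.get("blocked") else [],
                       "goto": None},
}
```

Each stage here corresponds to a dedicated hardware table in a real ASIC, which is exactly the structure a TTP describes so the controller and switch agree on it up front.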
OpenFlow is a Bad Fit to Implement X
Ivan points out that there are simpler things than OpenFlow that might provide a way to build overlay virtual networks. This is always going to be the case. For example, x86 assembly language seems like a really crappy abstraction to start with for providing video playback, but it’s where we started, and then we provided higher and higher levels of abstraction until we could offer a function that was basically “play this file”.
Similarly, OpenFlow isn’t the most natural fit for high-level tasks, but that’s kind of the point. We need layers of abstraction that sit between high-level tasks and the low-level way they are implemented.
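A small Python sketch of what such layering might look like: a high-level call expresses intent (“connect these two VMs over an overlay”) and a lower layer compiles it into OpenFlow-style match/action entries. Every name and field here is hypothetical, invented for illustration, not any real controller’s API.

```python
def connect_vms(vm_a, vm_b, tunnel_id):
    """High-level layer: callers state intent and never see flow entries."""
    return (encap_rules(vm_a, vm_b, tunnel_id) +
            encap_rules(vm_b, vm_a, tunnel_id))

def encap_rules(src, dst, tunnel_id):
    """Low-level layer: translate one direction of the intent into
    OpenFlow-style match/action flow entries."""
    return [
        {"match": {"in_port": src["port"], "dst_mac": dst["mac"]},
         "actions": [("set_tunnel", tunnel_id), ("output", "tunnel_port")]},
    ]

vm_a = {"port": 1, "mac": "aa:aa"}
vm_b = {"port": 2, "mac": "bb:bb"}
# One intent expands to one unidirectional flow entry per direction.
rules = connect_vms(vm_a, vm_b, tunnel_id=42)
```

The point of the sketch is that the low-level layer stays general-purpose: the same flow-entry primitives could be recompiled to serve service chaining, ACLs, or anything else, which is what gets lost when the high-level feature is baked directly into the device.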
Taking away general-purpose, low-level access, and thus reducing how we can reuse and remix underlying network functionality, is exactly what we’ve been getting wrong when we provide purpose-built hardware and software/firmware. Mistakenly baking one point in the centralized-distributed trade-off into our hardware is just another example of this.
The Bigger Problem
The bigger problem we have in networking is that we can’t seem to figure out how to provide layering for our solutions. Instead, we pick particular full stacks across the layers, along with their associated design decisions (like trade-offs between centralization and distribution), and bake them into our solutions and hardware.
Instead, we need to start actually providing pluggable elements at each layer and make them as open as we possibly can. OpenFlow is one good interface between the control plane and a good swath of different (software and hardware) data planes, but it’s not the only one. Similarly, OpenDaylight is working on providing a pluggable control plane with the intention of letting people pick their own trade-offs in the centralized-distributed design space.
Well said, Colin. I see many signs that folks are awakening to the fact that the all-centralized dogma is just as flawed as the all-distributed legacy. Finding the right blend of distributed and central, and making it easy to deploy through clean layering and decomposition of modules, is precisely the answer. Many old-school networking experts are struggling to “think different”…
The value of the SDN architectural approach (which is what SDN is; it isn’t a network solution and doesn’t do anything in and of itself, but rather lends itself to building solutions with a global network view and more abstracted APIs than the device or flow table model) and controllers with their associated NBI is that it completely abstracts the details of what southbound API is used to talk to the network devices. A controller based on the SDN architectural approach may or may not speak OpenFlow, and the answer to that question is a solid don’t-care for the Orchestration and Cloud OS layer talking to the NBI in the cloud stack. The power of SDN is that a controller can expose a network abstraction while the details of the device-level implementation remain completely hidden.

I completely agree that developing, sharing, and eventually standardizing NBIs is important and has the potential to be a game changer, but this is completely orthogonal to whether OpenFlow is the only, or even a good, southbound protocol to control some or all of the forwarding behaviors in a network. The ONF initially made the horrible mistake of positioning SDN as the tail and OpenFlow as the dog when they launched. Now that the interesting conversation in the industry is about the NBI, the ONF is at risk of becoming even more irrelevant in the future because they don’t appear to understand that the NBI is the key to integrating virtual networking with the on-its-way-to-ubiquity cloud movement. The most innovative and important data center SDN solutions are being built without the not-yet-ready-to-control-anything-but-the-forwarding-table OpenFlow protocol, and the ONF needs to take jurisdiction over the interesting decisions for the industry or become super-irrelevant as just the flow-table-wire-protocol foundation. The NBI is really important, but that has almost nothing to do with OpenFlow and whether it will ever be a comprehensive protocol for controlling network devices.