The buzz around the Software-Defined Networking (SDN) architecture got me thinking about how this technology will impact NetFlow and IPFIX exports in infrastructures that depend on the OpenFlow protocol. Well, I have some good news: it could create some great opportunities for flow collection and reporting tools.
To understand why switched and routed networks controlled by OpenFlow could impact how we collect NetFlow and IPFIX, I had to do some research to make sure I understood a bit more about the SDN architecture. As I read the white papers and websites that endorse or describe the technology, I came to realize that OpenFlow is in some ways similar to other ideas that have been tried in the past.
RFC 1633 on IntServ set aside bandwidth for individual flows, but I don't think it scales very well. The industry, Cisco included, settled on DiffServ instead, which marks packets using the 6-bit DSCP (Differentiated Services Code Point) field rather than reserving resources per flow.
During the six-year research collaboration between Stanford University and the University of California, the founders of SDN could have capitalized on where the above technologies fell short, and may even have leveraged concepts from the Connection-oriented Ethernet architecture. This is purely speculation. Nevertheless, the two primary components of SDN emerged from their efforts:
OpenFlow, which controls how packets are forwarded through network switches. OpenFlow, by the way, is only one component of Software-Defined Networking.
A set of global management interfaces.
One question IT managers want answered is: if attempts by companies like Cabletron and Cisco to do something similar failed, why will SDN succeed? This is a great question.
SDN will become mainstream for a few reasons:
Six companies that own and operate some of the largest networks in the world are behind it: Deutsche Telekom, Facebook, Google, Microsoft, Verizon, and Yahoo! formed the Open Networking Foundation (ONF). Right from the start, it has been marketed as a collaborative technology rather than one owned by a single company trying to take all the credit. This could be part of the reason NetFlow is evolving into IPFIX: many of Cisco's competitors don't want to be associated with NetFlow or risk a possible trademark infringement claim, even though Cisco has never pursued one against any vendor.
SDN is a solid technology that, although slightly different from its aforementioned predecessors, has been proven to work for over 10 years by multiple vendors.
All major switch vendors are embracing it. Examples include Cisco, Brocade, Juniper Networks, HP, Broadcom, Ciena, Riverbed Technology, Force10, Citrix, Dell, Ericsson, IBM, Marvell, NEC, Netgear, NTT and VMware.
You may have noticed that the six companies behind SDN (i.e., Deutsche Telekom, Facebook, Google, Microsoft, Verizon, and Yahoo!) are all heavily involved with cloud service offerings. You can bet that billing and/or accurate (i.e., non-sampled) statistics will be an important part of their enterprise implementations of SDN.
The Impact on NetFlow and IPFIX
How does all this impact flow collection and reporting? It isn't clear yet, but I think we can forecast how flow collection could tie into the OpenFlow protocol. The OpenFlow white paper describes a slightly different flow technology from what some consider "traditional NetFlow." For example, in OpenFlow, "a flow could be a TCP connection, or all packets from a particular MAC address or IP address, or all packets with the same VLAN tag, or all packets from the same switch port, but it doesn't have to be IP traffic." This is beginning to sound like Flexible NetFlow or, more likely, IPFIX. Notice that the definition calls for "all packets" (i.e., not packet samples). There are several other areas in the technology that make clear reference to true flow technologies, and if you look at the growing list of supporters, nearly all of them support NetFlow, IPFIX, or both.
SDN switches supporting OpenFlow provide the following four basic functions that impact flows (a minimal sketch in code follows the list):
The first packet(s) of a new flow are sent to a controller on a secure channel for decision making based on policies.
The controller decides, based on policies, whether the flow should be added to or removed from the flow table in all the switches along the flow path. Because of these policies, some flows (i.e., connection requests) could be dropped to curb things like DoS attacks or broadcast discovery traffic. Example policies include:
Guests can communicate using HTTP but only via a web proxy.
VoIP phones are not allowed to communicate with laptops.
Flows granted a connection are programmed into the switch fabric and forwarded at line rate.
{optional} Support for traditional layer 2/3 forwarding logic for environments that are not ready to commit 100% to OpenFlow.
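To make the packet-in / policy / install cycle concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the FlowKey fields are a reduced version of the OpenFlow match tuple, the two policies simply hard-code the examples above, and none of it reflects a real controller API.

```python
# Hypothetical sketch of an OpenFlow-style controller's packet-in handling.
# All names, addresses, and policies are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowKey:
    # A reduced version of the OpenFlow match tuple, enough for this example.
    vlan_id: int
    src_ip: str
    dst_ip: str
    ip_proto: int      # 6 = TCP, 17 = UDP
    dst_port: int


@dataclass
class Switch:
    name: str
    flow_table: dict = field(default_factory=dict)

    def install(self, key: FlowKey, action: str) -> None:
        # Entries programmed here would be matched in hardware at line rate.
        self.flow_table[key] = action


GUEST_VLAN, VOIP_VLAN = 30, 40
WEB_PROXY_IP = "10.0.0.80"      # hypothetical web proxy address


def policy_allows(key: FlowKey) -> bool:
    """The two example policies above, hard-coded for illustration."""
    if key.vlan_id == GUEST_VLAN:
        # Guests may only speak HTTP, and only via the web proxy.
        return key.dst_ip == WEB_PROXY_IP and key.dst_port == 80
    if key.vlan_id == VOIP_VLAN and key.dst_ip.startswith("10.0.10."):
        # VoIP phones may not talk to laptops (assumed to live in 10.0.10.0/24).
        return False
    return True


def handle_packet_in(key: FlowKey, path: list) -> str:
    """Called when a switch punts the first packet of an unknown flow."""
    if not policy_allows(key):
        return "dropped"                  # e.g. curbing DoS or discovery noise
    for sw in path:                       # program every switch along the path
        sw.install(key, action="forward")
    return "forwarded"


# Usage: a guest request via the proxy is installed; one straight to the
# internet is refused before it ever touches the fabric.
edge, core = Switch("edge-1"), Switch("core-1")
print(handle_packet_in(FlowKey(GUEST_VLAN, "10.0.30.5", WEB_PROXY_IP, 6, 80), [edge, core]))
print(handle_packet_in(FlowKey(GUEST_VLAN, "10.0.30.5", "203.0.113.9", 6, 80), [edge, core]))
```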
Keeping the above four points in mind, the controller maintains the flow table, and this is where NetFlow or IPFIX will likely make a play in this technology. An entry in the flow table has three characteristics (sketched in code after the list):
A packet header that defines the flow tuple, which can include up to 10 fields. Each field can be a wildcard, which allows flows to be aggregated. OpenFlow describes a 10-tuple: ingress port, VLAN ID, source and destination MAC addresses, source and destination IP addresses, Ethernet type, IP protocol, and the source and destination TCP/UDP ports. Define the tuple for aggregation carefully; with the right level of aggregation, sampling can usually be avoided.
The action, which defines how the packets should be processed (e.g., dropped, given priority, etc.). This will be a great addition to flow reporting if this metric is exported.
Statistics, which keep track of the number of packets and bytes for each flow, and the time since the last packet matched the flow. This time could help with the eventual removal of inactive flows.
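As a rough illustration of those three characteristics, here is a hypothetical flow-table entry in Python: a set of match fields where None acts as a wildcard, an action, and per-flow counters with a last-seen timestamp. The field names loosely follow the OpenFlow 10-tuple; the structure itself is invented for this sketch and is not a real switch data structure.

```python
# Hypothetical flow-table entry: match fields, action, and statistics.
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    # None means "wildcard"; anything else must match the packet exactly.
    in_port: Optional[int] = None
    vlan_id: Optional[int] = None
    src_mac: Optional[str] = None
    dst_mac: Optional[str] = None
    eth_type: Optional[int] = None
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    ip_proto: Optional[int] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: dict) -> bool:
        return all(v is None or pkt.get(k) == v for k, v in vars(self).items())


@dataclass
class FlowEntry:
    match: Match
    action: str = "forward"       # e.g. forward, drop, set-priority
    packets: int = 0
    bytes: int = 0
    last_seen: float = 0.0        # used to age out inactive flows

    def account(self, pkt: dict) -> None:
        self.packets += 1
        self.bytes += pkt["length"]
        self.last_seen = time.time()


# Usage: wildcard everything except the VLAN tag, so one entry (and one set
# of counters) aggregates all packets carrying VLAN 30.
entry = FlowEntry(match=Match(vlan_id=30))
pkt = {"vlan_id": 30, "src_ip": "10.0.30.5", "dst_ip": "10.0.0.80", "length": 1400}
if entry.match.matches(pkt):
    entry.account(pkt)
print(entry.packets, entry.bytes)
```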
If NetFlow or IPFIX are to play a role in OpenFlow, exporting of flows would require that characteristics 1 and 3 above work hand in hand.
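Continuing the sketch, combining characteristics 1 and 3 could look something like this: the match fields identify the flow, the counters provide the measurements, and together they form the record a NetFlow or IPFIX exporter would send. The element names below only mimic IPFIX information elements; this is not a real exporter.

```python
# Hypothetical conversion of a flow-table entry into an IPFIX-style record.
def to_flow_record(match_fields: dict, packets: int, octets: int,
                   last_seen: float) -> dict:
    record = {
        "packetDeltaCount": packets,
        "octetDeltaCount": octets,
        "flowEndSeconds": last_seen,
    }
    # Wildcarded (absent) key fields are simply left out of the record,
    # which is how aggregation would show up on the collector side.
    record.update({k: v for k, v in match_fields.items() if v is not None})
    return record


print(to_flow_record({"vlanId": 30, "sourceIPv4Address": None},
                     packets=1, octets=1400, last_seen=1700000000.0))
```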
Flows from the Switch or Controller?
In a Software-Defined Network, the controller can take over DNS, DHCP and authentication services from legacy servers. It can also assume responsibility for VLAN configuration, ACLs and other routines that have traditionally been defined at the switch or router. The result: if a company wants to leave VLANs intact, this can be accommodated. What's nice about centralizing these responsibilities with the controller is that, for example, when a wireless device such as a BYOD handheld moves through the network, the controller tracks its location and reprograms the flow tables in the switch fabric accordingly. As the user moves through the network, a seamless handoff from one access point to another is performed. With the controller providing all of the initial setup and eventual teardown of connections, it seems logical that it would also export flow details using NetFlow or IPFIX.
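Here is a purely illustrative sketch of that roaming behavior: the controller keeps the global view, so when a device moves it simply rewrites the affected flow entries to point at the new attachment point. A real controller would push per-switch updates along each affected path; this compresses the idea into one logical table for clarity.

```python
# Hypothetical sketch of controller-driven handoff when a device roams.
def handle_device_move(fabric_flows: dict, mac: str,
                       new_switch: str, new_port: int) -> None:
    """Re-point every installed flow destined to `mac` at its new location."""
    for (src_mac, dst_mac), action in fabric_flows.items():
        if dst_mac == mac:
            fabric_flows[(src_mac, dst_mac)] = ("output", new_switch, new_port)


# Usage: a laptop-to-handheld flow follows the handheld from ap-1 to ap-2.
handheld = "aa:bb:cc:dd:ee:ff"
fabric_flows = {("11:22:33:44:55:66", handheld): ("output", "ap-1", 3)}
handle_device_move(fabric_flows, handheld, "ap-2", 7)
print(fabric_flows)
```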
Another question is: will the flow information representing the connections through the SDN be exported by the controller, or will it continue to be exported by the switches themselves?
Answer: Your first inclination might be that it doesn't matter: as long as the flows get exported to the collector and the reporting utility can display the data the end user wants to see, who cares where the flows come from? In many cases this is true, but not in all cases (e.g., encrypted tunnels).
Based on what I read, flows from an SDN could trigger a new breed of flow reports. For example, if the controller exports details on the virtual overlay network, details on the hosts and applications inside encrypted tunnels could become available. On the other hand, flows coming directly from the switches and routers will likely be missing this information.
There's more good news. As described above in the three characteristics of the flow table, the controller contains statistics on each flow, so it might be able to export unsampled flow details even when the underlying connection fabric consists of sFlow switches. Enterprises that compromised on statistical insight by investing in packet-sampling sFlow switches may end up gaining NetFlow or IPFIX accuracy, with 100% representation of the data, coming from the controller.
More OpenFlow Questions
1) Can a centralized controller be fast enough to process new flows and program the flow switches?
A low-cost desktop PC acting as a controller can process over 10K new flows per second. This is more than enough capacity for a large college campus. Some enterprise-class NetFlow and IPFIX collection and reporting solutions can process well over 100K flows per second. If flow volume from a single device exceeds the 100K flows/second threshold, a less granular tuple can be defined to increase aggregation; unlike packet sampling, this still maintains 100% accuracy.
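Here is a quick sketch of that trade-off, with made-up traffic: aggregating on a coarser key (host pair instead of the full 5-tuple) yields fewer flow records, yet the byte counts still cover every packet, which is the accuracy advantage over sampling.

```python
# Hypothetical illustration: coarser flow keys cut record volume without
# giving up exact packet/byte accounting.
from collections import Counter

packets = [
    # (src_ip, dst_ip, proto, src_port, dst_port, bytes)
    ("10.0.1.5", "10.0.2.9", 6, 40001, 443, 1500),
    ("10.0.1.5", "10.0.2.9", 6, 40002, 443, 900),
    ("10.0.1.5", "10.0.2.9", 6, 40003, 80, 400),
]

fine = Counter()    # full 5-tuple: one record per TCP connection
coarse = Counter()  # host pair only: one record per conversation pair
for src, dst, proto, sport, dport, size in packets:
    fine[(src, dst, proto, sport, dport)] += size
    coarse[(src, dst)] += size

print(len(fine), sum(fine.values()))      # 3 records, 2800 bytes
print(len(coarse), sum(coarse.values()))  # 1 record, still 2800 bytes
```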
2) What if the controller fails?
Redundancy issues are addressed by making the controller stateless, allowing simple load-balancing over multiple separate devices.
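One plausible (and purely illustrative) way to picture that: hash each new flow's key to pick a controller from the pool. Because the controllers hold no unique per-flow state, a failed controller can simply be dropped from the pool and its share of new flows lands on the survivors.

```python
# Hypothetical sketch of spreading new flows across stateless controllers.
import hashlib

def pick_controller(flow_key: tuple, controllers: list) -> str:
    # Deterministic hash of the flow key chooses a controller; since the
    # controllers are stateless, any of them can handle any flow.
    digest = hashlib.sha256(repr(flow_key).encode()).hexdigest()
    return controllers[int(digest, 16) % len(controllers)]

pool = ["ctrl-a", "ctrl-b", "ctrl-c"]
key = ("10.0.1.5", "10.0.2.9", 6, 40001, 443)
print(pick_controller(key, pool))                   # normal operation
print(pick_controller(key, ["ctrl-a", "ctrl-c"]))   # ctrl-b has failed
```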
Three years ago, the big driver of SDN was, "If I do this, I'll get $1,000 switches," Joshipura said. However, cheaper gear isn't the current payoff for those deploying SDN, because most implementations will remain in hybrid networks with traditional gear for a long time, he said. "For the next three to five years, until we get to mainstream SDN, cost is not the primary driver of SDN."
Last August, SDN analyst Mark Leary blogged:
"The low-cost generic network device will not succeed in the software-defined networking (SDN) environment."
Well, if cost isn't the primary driver for SDN, perhaps the deeper insight provided by flows from the controllers on encrypted traffic is. I think the primary driver for SDN is that "enterprises and carriers gain unprecedented programmability, automation, and network control, enabling them to build highly scalable, flexible networks that readily adapt to changing business needs." My hope is that NetFlow and IPFIX will continue to play a role in it.
Below is the official response to Mike's blog post above from SDN analyst Mark Leary:
Hey Brad,
Mike is spot on in highlighting the ability to implement SDN and OpenFlow piecemeal without undoing, or worse yet, breaking our current networking systems and processes. Let's face it...In networking, incremental buildouts are a tremendous advantage. Mike is also spot on when he points to OpenFlow as a technology that builds on the work of certain distant past (SecureFast) and still very current technologies (NetFlow). I would, however, take exception to his labeling OpenFlow as a "respin" of these technologies. (NOTE: OpenFlow zealots would likely jump out of their chairs here!) I understand that as a NetFlow tool vendor, Plixer would like to send a message that NetFlow and OpenFlow are close relatives - even siblings. I, myself, see this as diminishing the value and impact of OpenFlow -- and the OpenFlow movement.
As far as the overall view presented in the blog... I see Mike falling into the same trap that many very intelligent OpenFlow technologists fall into... They equate OpenFlow and SDN. (Of course, the OpenFlow zealots out there would see this as perfectly fine. I'm also just as sure Plixer would be happy with the perception that SDN equals OpenFlow equals NetFlow.)
I look at SDN as a major shift in the way we architect and operate our networks. I look at OpenFlow as a standardized evolving technology that enables this architectural shift towards SDN. As an SDN-enabling tool, it promotes centralized network information processing, consolidated network control, and active network device direction. These are all good things - and all strong characteristics of an SDN environment. They are not, however, the only characteristics that mark an SDN in my opinion.
If your approach to SDN is to be broad-based (SDN is applied across the enterprise network) and multi-vendor (Interoperability is a must), then OpenFlow is certainly a fundamental technology in implementing an SDN environment. You should not pursue a multi-vendor, network-wide SDN deployment without OpenFlow. With that said, however, to fully take advantage of a truly "software-directed" network environment, OpenFlow would be one component of many required to complete your entire SDN deployment. Now, maybe OpenFlow expands over the next decade to encompass everything that is SDN. Then we can equate SDN and OpenFlow. Until that time, I couple, but do not equate the two.
In defense of my "coupled, but not equal" view, I'll offer the recent Plexxi announcement focused on the east-west traffic problem in data centers. They offer a complete SDN solution (albeit for a contained problem) without incorporating any OpenFlow technology at all. Here, Plexxi uses centralized control software to actively monitor workloads (Think applications.) and define logical resource groups (Think linked servers, storage, VMs...). Armed with this information, the control software then defines the underlying network -- a network that encompasses ethernet/optical switches, wave division multiplexing, and select network control functions. (NOTE: Plexxi splits network control functions between the central controller and the distributed devices.) Obviously, this is a closed system targeting a specific network segment and problem. But, in my view, it well represents an SDN solution at work in the real world.
None of the above should be taken as a slight against the OpenFlow movement or those suppliers aggressively rolling out OpenFlow-based SDN solutions (e.g., HP). OpenFlow developments all point to hastened and heightened SDN adoption and success in the future. For that, OpenFlow should be applauded... loudly!