Whitebox switches make use of generic and generally inexpensive hardware along with a network operating system that can be purchased and installed separately. Often the hardware and software come from different vendors, and there are several reasons this practice is becoming more common, especially in the data center.
First, the underlying hardware utilizes more generic components which, in theory, reduce the cost of the switch. This contrasts with vendors that use proprietary chipsets and sometimes claim that their technology lets them offer unique features.
Second, decoupling hardware from the network operating system provides greater ability to centralize management and use programmatic tools such as Chef and Ansible. It also opens the door to a vendor-neutral approach to networking using OpenStack, which can provide vendor interoperability and manage network hardware as pools of resources.
Third, because these normally Linux-based network operating systems don’t rely on (or are minimally reliant on) the underlying hardware, they can be completely customized.
In recent years, whitebox switching has begun to trickle down from webscale-sized companies to large enterprise data centers that need some of the same features normally needed by only the biggest infrastructures in the world. Requirements such as network functions virtualization and a composable infrastructure were common only in those huge organizations but are now being implemented in much smaller environments.
What I’m interested in lately is how this is relevant to the non-webscale enterprise. It’s still debatable whether whitebox switches offer much cost savings at all, and even fairly large enterprise organizations do just fine with their favorite vendor switches. The campus access layer, for example, requires very few advanced Layer 3 features and tends to be somewhat static with regard to configuration changes. Rarely do changes to the access layer require any sort of orchestration or the deployment of end-to-end services.
If the compelling reasons to move to a whitebox model are cost savings, increased programmability, and customizable operating systems, I wonder what the actual benefit is of moving an access layer built on one or a few models from a single vendor to a whitebox solution.
In my experience, access switches are among the cheapest network devices even from the biggest network vendors such as Cisco. Also, an access layer is typically composed of many devices all doing the same thing, which means it’s easy to get a deep discount when buying in bulk. Without comparing a Cisco and whitebox BoM, it’s hard for me to definitively say one option is cheaper than the other, but I suspect buying a cheap Dell switch and a separate NOS license will not give me huge cost savings.
Network programmability is certainly an advantage for busy network administrators, but the access layer is normally static enough that programmability is interesting without necessarily being a reason to overhaul an entire switching environment.
Typically, access switches are replaced when they get very old, not when new features are released. I understand that there are exceptions, such as supporting 802.1X at the port level, but normally these kinds of feature additions are made through iterative code upgrades. The access layer just doesn’t do enough to require significant customization and the frequent addition of new features. I know we can always think of exceptions, but keep in mind that I’m speaking of a typical mid-sized enterprise access layer, not a webscale company’s data center.
To me, the access layer is already pretty much a composable part of the infrastructure. How many models do you have in your access layer? I can think of maybe three, and they’re all simple, cheap, and easy to swap out.
Even though I don’t see whitebox switching as extremely relevant to an enterprise access layer, I still really like the idea of centralized management and increased programmability so long as the cost is a wash. What I’m doing is taking steps to deploy just a few switches in a corner of the network, such as a network closet servicing only a few endpoints. This way I can see firsthand the real benefits or drawbacks of bothering with this technology in this often-forgotten part of the network.
IP Infusion presented at Networking Field Day 15, introducing themselves as the whitebox NOS we’ve all used but never heard of. ZebOS, in particular, has been used by OEMs for almost 20 years on hardware from vendors such as NEC, Brocade, F5, Riverbed, and quite a few others. They started in data centers but also provide solutions for service providers, with the goal of providing the very best network OS for whitebox and NFV.
Their expertise and experience are in carrier-grade networks, but in the last few years they’ve seen the shift to software at every level of the network. They developed a complete finished product specifically for end users in the enterprise, OcNOS, and its NFV companion, VirNOS, both based on ZebOS.
Their NOS supports most Layer 2 and Layer 3 features and is based on a modular design, meaning customers can purchase and use only the features they need, such as simple switching or advanced routing functions. ZebOS also provides data plane integration by supporting a variety of chipsets in order to create a hardware abstraction layer.
Hardware abstraction means that protocols have no dependency on the silicon itself. The hardware abstraction layer has dependencies on SDKs such as the Broadcom SDKs, but the protocol modules that sit on top of the hardware abstraction layer don’t have dependencies on the underlying hardware.
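To make that layering concrete, here is a minimal sketch in Python. It is not how ZebOS or OcNOS is actually implemented; every class and method name below is hypothetical, and the "chipset" backend just records routes in a dictionary instead of calling a real SDK. The point is only that the protocol-side code talks to an abstract interface and never to the silicon.

```python
import ipaddress
from abc import ABC, abstractmethod


class ForwardingBackend(ABC):
    """Hypothetical abstraction layer interface: one subclass per chipset SDK."""

    @abstractmethod
    def install_route(self, prefix: str, next_hop: str) -> None:
        """Program a route into whatever forwarding hardware sits underneath."""


class FakeChipsetBackend(ForwardingBackend):
    """Stand-in for an SDK-specific backend (assumption: no real SDK calls)."""

    def __init__(self) -> None:
        # Maps prefix string -> next-hop string, in place of a hardware table.
        self.table = {}

    def install_route(self, prefix: str, next_hop: str) -> None:
        self.table[prefix] = next_hop


class StaticRouteModule:
    """Protocol-side module: depends only on the abstract interface."""

    def __init__(self, backend: ForwardingBackend) -> None:
        self.backend = backend

    def add_static_route(self, prefix: str, next_hop: str) -> None:
        # Validate the inputs here so every backend receives normalized values.
        network = ipaddress.ip_network(prefix)
        hop = ipaddress.ip_address(next_hop)
        self.backend.install_route(str(network), str(hop))


if __name__ == "__main__":
    backend = FakeChipsetBackend()
    StaticRouteModule(backend).add_static_route("10.0.0.0/24", "192.0.2.1")
    print(backend.table)
```

Swapping in a different chipset would mean writing another ForwardingBackend subclass; StaticRouteModule would not change at all.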
This opens up a new world of centralized network programmability for the access layer using REST APIs and NETCONF, while also supporting traditional management methods such as the CLI and SNMP.
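For a sense of what NETCONF-style management looks like on the wire, here is a small sketch that builds a standard get-config request using only the Python standard library. The RPC envelope and namespace come from the base NETCONF specification (RFC 6241); the function name is my own, and nothing here is specific to OcNOS or says anything about which data models it exposes. Actually sending the message would need an SSH session and a NETCONF client, which I’ve left out.

```python
import xml.etree.ElementTree as ET

# Base NETCONF namespace defined in RFC 6241.
NETCONF_BASE_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
STANDARD_DATASTORES = ("running", "candidate", "startup")

# Serialize the base namespace as the default (unprefixed) namespace.
ET.register_namespace("", NETCONF_BASE_NS)


def build_get_config_rpc(message_id: str, datastore: str = "running") -> str:
    """Return a NETCONF <get-config> request as an XML string.

    Hypothetical helper for illustration; it only builds the message.
    """
    if datastore not in STANDARD_DATASTORES:
        raise ValueError(f"unknown datastore: {datastore!r}")

    def tag(name: str) -> str:
        # ElementTree expects namespaced tags in {namespace}name form.
        return f"{{{NETCONF_BASE_NS}}}{name}"

    rpc = ET.Element(tag("rpc"), {"message-id": message_id})
    get_config = ET.SubElement(rpc, tag("get-config"))
    source = ET.SubElement(get_config, tag("source"))
    ET.SubElement(source, tag(datastore))
    return ET.tostring(rpc, encoding="unicode")


if __name__ == "__main__":
    print(build_get_config_rpc("101"))
```

The same request could be sent to every switch in a closet from one script, which is the kind of centralized management I’m interested in testing.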
If the cost of licensing OcNOS and a small pile of Dell or Edge-Core switches is similar to what you would spend on a closet switch refresh, this is a great way to PoC whitebox switching at the access layer and gain the hands-on experience to tell us whether or not it’s relevant to this part of the network just yet.
I’m not convinced that it’s time to schedule a hardware refresh of an enterprise’s access layer and move to whitebox switching, but I really like the idea of better centralized management and network programmability. I’m interested in testing this out in one small part of the access layer while keeping in mind that there’s nothing wrong with changing my opinion later on.