Apstra, Incorporated isn’t focused on new features, more advanced silicon, or some new widget. Instead, they’re offering a different way to look at networking. Apstra offers an early form of intent-driven networking that abstracts network programmability and allows network engineers to configure intent rather than device features. We expect the network to behave in a specific way, so we configure our intent accordingly. I was very excited to meet the Apstra team at Networking Field Day 13, and they didn’t disappoint.
The Problem: Network-Blindness
Derick Winkworth (@CloudToad) and Jeremy Schulman (@nwkautomaniac) gave an interesting presentation exploring the nature of a network as a single logical construct rather than individual pieces of hardware. Derick started off by describing a condition called “face-blindness.” Those suffering from face-blindness have difficulty organizing the images of facial features into an entire face. They can see the individual components, but they can’t see the logical construct the arrangement of these images creates – a face.
He explained that some see the network in the same way, calling the phenomenon “network-blindness.” In other words, the network doesn’t exist in and of itself; it’s actually an ideal state that exists in our minds. A network is a logical construction of various components such as cables, hardware, and configurations, and we expect it to behave a certain way. Unfortunately, those suffering from network-blindness aren’t always aware of the network as a logical whole and instead focus on its individual components.
This may sound esoteric, but it makes sense to me, and I feel it’s important to flesh this out because it forms the foundation of what Apstra is doing. What Derick focused on was context, and I really enjoyed hearing a fellow networking geek think about networks this way. Think of it like this: alone, each individual router, switch, cable, and line of code is not a network. However, each component is part and parcel of the greater context of the network in which it participates.
He elaborated by saying that many large networks are so complex that it’s just too difficult for an engineer to keep an accurate big picture in mind. I know from experience that Derick is right about that. Networking folks really do get lost in the weeds configuring devices and features in one small part of the network and unintentionally forget the big picture. We want to maintain the vision of the big picture, but it’s hard to do that when troubleshooting why DSCP markings aren’t treated properly on some switch somewhere.
Jeremy explained that Day 2 troubleshooting is probably 70% of the problem with networks. Almost immediately after an initial install, the network is going to differ from the original design. How does an engineer know the true state of the network at any given time? And probably more importantly, how does the current state differ from the expected state?
The Cure: Situational Awareness
From a technological perspective, the Apstra Operating System maintains what Jeremy Schulman calls “situational awareness.” This means AOS knows the big picture. It knows what the engineer expects of the network, so it can therefore detect when something isn’t right based on how the current state differs from expected state. What we expect from the network is the intent, and this is how Apstra cures network-blindness.
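To make the idea concrete, here is a minimal sketch of comparing expected state against observed state. This is my own illustration, not Apstra's code or API: the function name, the Anomaly type, the device names, and the state keys are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    """One place where the observed state differs from the intent."""
    device: str
    key: str
    expected: object
    actual: object


def find_anomalies(expected, observed):
    """Compare expected state (the intent) with observed state, per device."""
    anomalies = []
    for device, want in sorted(expected.items()):
        have = observed.get(device, {})
        for key, value in sorted(want.items()):
            actual = have.get(key)  # None when the device reported nothing
            if actual != value:
                anomalies.append(Anomaly(device, key, value, actual))
    return anomalies


# Hypothetical intent: every leaf has two BGP sessions and two uplinks up.
expected_state = {
    "leaf1": {"bgp_sessions_up": 2, "uplinks_up": 2},
    "leaf2": {"bgp_sessions_up": 2, "uplinks_up": 2},
}
# Hypothetical telemetry: leaf2 has lost one BGP session.
observed_state = {
    "leaf1": {"bgp_sessions_up": 2, "uplinks_up": 2},
    "leaf2": {"bgp_sessions_up": 1, "uplinks_up": 2},
}

anomalies = find_anomalies(expected_state, observed_state)
for a in anomalies:
    print(f"{a.device}: {a.key} expected {a.expected}, got {a.actual}")
```

The point of the sketch is that the engineer only declares what should be true. Everything that deviates falls out of the comparison, with no need to know in advance which switch to look at.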
Under The Hood
Under the hood, AOS breaks up a particular service into its components with regard to configuration and expected state. AOS continually collects network state information, validates configuration, and ultimately performs device configuration.
The impression I got was that Apstra is taking the methods to create a programmable infrastructure that we’re already familiar with and packaging them into an extremely easy-to-use graphical user interface. Then AOS takes abstraction to the next level by providing all the built-in logic for the intent-engine. This is an overlay approach, not device-level programmability, and is what I feel SDN has been promising for years.
Picture a single piece of software that has all your Python scripts, Ansible playbooks and Chef cookbooks all rolled into one. In this case Chef is a good analogy because each network device needs an AOS agent installed in order to be managed.
Rather than use SNMP, AOS uses this agent-based method to collect network data, perform those often-forgotten validation checks, and configure devices. This sets AOS apart from other vendors’ solutions in two ways: it doesn’t rely on SNMP, and it’s vendor agnostic.
During the first demo, Jeremy pointed out that the interface was so straightforward that a facilities person would be able to identify and fix a cabling problem in the data center. In the screenshot below, notice the two red cables and the alarms.
This means that AOS knows what the network is supposed to be like, and this is the context that Derick referred to. This implies that either a large number of build templates come with AOS, or I’ll be spending a lot of time configuring my own. Maybe I need to stop configuring snowflakes, but as the platform matures, I wonder how the AOS intent-engine will reconcile and validate built-in templates with custom configurations.
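The cabling check from the demo can be pictured as a set comparison between the links the design calls for and the links the devices actually report. The sketch below is my own illustration under that assumption, not how AOS does it internally. The helper names, device names, and port names are hypothetical.

```python
def link(a_dev, a_port, b_dev, b_port):
    """Represent a cable as an unordered pair of (device, port) endpoints."""
    return frozenset({(a_dev, a_port), (b_dev, b_port)})


def _as_sorted_pairs(links):
    """Turn a set of links into a stable, printable list of endpoint pairs."""
    return sorted(tuple(sorted(l)) for l in links)


def check_cabling(intended, discovered):
    """Return (missing, unexpected) links relative to the intended design."""
    missing = intended - discovered
    unexpected = discovered - intended
    return _as_sorted_pairs(missing), _as_sorted_pairs(unexpected)


# Hypothetical design: each leaf uplink lands on a specific spine port.
intended = {
    link("spine1", "eth1", "leaf1", "eth49"),
    link("spine1", "eth2", "leaf2", "eth49"),
}
# Hypothetical neighbor data from the devices: the two cables are swapped.
discovered = {
    link("spine1", "eth1", "leaf2", "eth49"),
    link("spine1", "eth2", "leaf1", "eth49"),
}

missing, unexpected = check_cabling(intended, discovered)
for pair in missing:
    print("missing link:", pair)
for pair in unexpected:
    print("unexpected link:", pair)
```

Two swapped cables show up as two missing links and two unexpected ones, which is roughly the kind of information a facilities person would need to go fix it.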
AOS also validates running configuration, even in a multi-vendor network. That means the folks at Apstra built so much logic into their software that you can swap out switches from different vendors and AOS would find the diffs, automatically applying the necessary changes to produce the intended state. That’s just amazing. Even if you’re a one-vendor shop, this is still utterly amazing.
Take a look at the screenshot below. The validation process alone is huge and something I would require before even attempting to integrate AOS into my network.
AOS utilizes pools of built-in resources to build configurations based on the parameters a network engineer sets for a service. The system uses a series of config files using extendable Quagga and Jinja templates, for example. When a device enters inventory, information about the device is gathered in order to make it available for use as a resource. Currently, version 1.0 supports 26 platforms including hardware running Cisco NX-OS, Arista EOS/vEOS, Cumulus Linux, and CVX. Version 1.1 should be out very soon and should support more platforms.
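Here is a rough sketch of the pool-plus-template idea, assuming a pool simply hands out the next free value and a template fills in the blanks. It is not Apstra's implementation. I use Python's built-in string.Template as a stand-in for Jinja so the example runs without extra packages, and the ResourcePool class, the address and ASN ranges, and the config syntax are all made up for illustration.

```python
import ipaddress
from string import Template

# Stand-in for a Jinja template; the config syntax is generic, not any vendor's.
BGP_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface lo0\n"
    " ip address $loopback/32\n"
    "router bgp $asn\n"
    " router-id $loopback\n"
)


class ResourcePool:
    """Hand out values from a finite pool, one per request, never twice."""

    def __init__(self, values):
        self._free = iter(values)

    def allocate(self):
        try:
            return next(self._free)
        except StopIteration:
            raise RuntimeError("resource pool exhausted") from None


# Hypothetical pools: private ASNs and loopback addresses.
asn_pool = ResourcePool(range(65001, 65100))
loopback_pool = ResourcePool(ipaddress.ip_network("10.0.0.0/29").hosts())

# Each device entering inventory draws one value from each pool.
configs = {}
for hostname in ["leaf1", "leaf2"]:
    configs[hostname] = BGP_TEMPLATE.substitute(
        hostname=hostname,
        asn=asn_pool.allocate(),
        loopback=loopback_pool.allocate(),
    )

print(configs["leaf2"])
```

The engineer never types an ASN or a loopback address. They define the pools once, and the per-device values are drawn and rendered automatically.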
Something important to note is that the AOS configs are automatically generated using what Apstra considers best practice for a particular service. They determined these by speaking with the various networking vendors, but this does affect your ability to use certain proprietary features. Jeremy addressed this by explaining that AOS doesn’t really abstract device-level mechanisms and instead abstracts entire network services composed of smaller network services.
Therefore the abstraction AOS provides uses whatever features are available to compose the overall service, regardless of the proprietary technologies involved. This means you get base configurations that get the job done (that is, provide a particular network service) while still being able to pursue DevOps goals around the edges.
I can’t imagine how much programming has to be built into the system in order to accommodate the many services and vendors used in today’s networks, which is why I’m not surprised to hear that the platform technology is currently focused mainly on layer 3 data center Clos designs.
To that end, Apstra embraces the open network-developer community and welcomes community development of software to interface with their APIs. This should expedite the development of the platform and its relevance in actual production networks. I assume we’ll see a lot of third-party integration tools show up in time, so keep an eye on their GitHub page for more details.
I do wonder if relying on AOS build templates will be OK with higher level network engineers designing and building large data centers. I want to feel comfortable with that, but I admit that I’d probably feel compelled to scroll through all the configs myself after AOS did its magic.
So far, AOS is geared mainly for the data center, which I completely understand, but it’s also a sign of the platform’s immaturity. I don’t mean that to be negative; in fact, Jeremy explained that “the platform technology of AOS is adaptable for just about any kind of network application you want to use.” I have to assume that as AOS evolves beyond version 1.0, its relevance to the campus, WAN, etc. will also be developed and evolve. Apstra’s goal is to follow the example of the server world from years ago, when hardware and software were decoupled, giving engineers incredible choice and control over the infrastructure.
There are networking vendors that have a slick GUI to manage piles of their switches, but AOS is vendor agnostic, community-focused, and intent-driven. These are major differentiators and not trivial at all.
I don’t know too much about the business side of things, and I know that companies are bought and sold regularly in Silicon Valley. But from a technical perspective, the logical conclusion of what Apstra is doing is incredibly compelling. In the near term, this will change data center networking. In the long-term, this will revolutionize the way we look at networking altogether.
Keep an eye out for their next release, which is coming soon.