Over the last few weeks I’ve noticed a few tweets and blog posts regarding the immaturity of network automation methods and the danger in utilizing those methods in production networks. Though I agree that processes always have room to mature and that wiggling wires in a production environment always poses some risk, I believe this emerging narrative on social media makes several assumptions that aren’t necessarily true.
Several of these tweets characterized network automation as merely deploying some scripts to push configuration to a small (or large) number of network devices. By that reasoning, a misplaced comma or some other minor typo in the code would devastate the entire production network. This would be my conclusion as well if the premise were completely true, but I don’t believe it is.
I’m sure there are some network engineers out there who are quick to write a sloppy script and push config into production without proper review, testing and validation, but that’s not at all part of the network automation movement as I see it. NetDevOps applies the DevOps paradigm to network management, which means there is great value placed on continuous improvement, testing and validation, collaboration and community. Yes, there is also the no-blame concept, which may decrease the level of accountability for any single network engineer, but that’s more than matched by the emphasis on peer review and built-in client-side mechanisms to prevent the network from blowing up. Therefore, the original argument that network automation is too dangerous battles against a straw man that has little to do with what’s really going on in the network automation trend.
Next, to say that network automation is clever but too risky to use in production presupposes that managing network devices via SNMP or directly on the command line one device at a time poses much less risk. Trends in network automation seek to decrease human error by eliminating human interaction with individual boxes. Assuming that proper measures were taken to test and validate configuration, this means that proper use of network automation techniques would actually decrease the likelihood of error, not increase it. For example, copying and pasting 10 lines of config into a router has no built-in validation or rollback mechanism if the configuration doesn’t work right. Some lines of configuration would still apply, though, resulting in an ugly effort to figure out which commands did take and then go back to delete them one by one. Contrast this with pushing configuration objects to an open NX-OS device. If there were a problem with any part of the configuration, the rollback-on-error option would automatically roll back the entire configuration.
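To make the contrast concrete, here is a minimal Python sketch of the two behaviors. It is a toy simulation, not NX-OS code: the ToyDevice class, the rule that any line starting with "bogus" is rejected, and both function names are assumptions made up purely for illustration.

```python
import copy


class ConfigError(Exception):
    """Raised when the toy device rejects a configuration line."""


class ToyDevice:
    """A stand-in for a network device; not a real vendor API."""

    def __init__(self):
        self.running_config = []

    def apply_line(self, line):
        # Hypothetical validation rule: reject anything starting with "bogus".
        if line.startswith("bogus"):
            raise ConfigError(f"invalid command: {line!r}")
        self.running_config.append(line)


def paste_line_by_line(device, lines):
    """Mimic copy and paste at the CLI: a bad line errors, the rest still apply."""
    for line in lines:
        try:
            device.apply_line(line)
        except ConfigError:
            pass  # the CLI prints an error and keeps accepting input


def push_with_rollback(device, lines):
    """Apply every line or none of them; return True on success."""
    checkpoint = copy.deepcopy(device.running_config)
    try:
        for line in lines:
            device.apply_line(line)
    except ConfigError:
        device.running_config = checkpoint  # restore the pre-change state
        return False
    return True


change = ["vlan 10", "name users", "bogus command", "vlan 20"]

pasted = ToyDevice()
paste_line_by_line(pasted, change)
print(pasted.running_config)  # partial config is left behind

pushed = ToyDevice()
ok = push_with_rollback(pushed, change)
print(ok, pushed.running_config)  # nothing applied, device unchanged
```

The pasted device ends up with every line except the bad one, which is exactly the half-applied state that has to be cleaned up by hand. The pushed device is left as it was before the change.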
I also understand that part of the concern is the awkward balance between efficiency, i.e. managing many network devices at once, and keeping the failure domain as small as possible, i.e. managing network devices one box at a time. This tension makes sense to me because I’ve been in the dreaded network down scenario enough times that I know to avoid it at all costs. However, the cause of most network outages I’ve witnessed was human error at the command line. Therefore, decreasing the potential for this is a step in the right direction, not a move toward nuclear annihilation of the network.
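One common way to ease that tension is to automate the change but roll it out in small batches, checking health before moving on. The sketch below is only an illustration of the idea: staged_rollout, the push and healthy callables, and the switch names are all hypothetical placeholders, not part of any real tool.

```python
def staged_rollout(devices, push, healthy, batch_size=2):
    """Push to devices in small batches, halting after the first unhealthy batch.

    `push` and `healthy` are caller-supplied callables (hypothetical here).
    Returns the devices that were changed before the rollout stopped.
    """
    changed = []
    for start in range(0, len(devices), batch_size):
        batch = devices[start:start + batch_size]
        for device in batch:
            push(device)
            changed.append(device)
        if not all(healthy(device) for device in batch):
            break  # stop before the problem spreads to the next batch
    return changed


# Toy usage: "sw3" fails its health check after the change.
devices = ["sw1", "sw2", "sw3", "sw4", "sw5", "sw6"]
touched = staged_rollout(
    devices,
    push=lambda d: None,           # placeholder for a real configuration push
    healthy=lambda d: d != "sw3",  # placeholder for a real post-change check
)
print(touched)
```

In this toy run the rollout stops after the second batch, so sw5 and sw6 are never touched. The batch size is the knob that trades speed against the size of the failure domain.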
I know that I’m making a small pile of assumptions as well. I assume that the network engineer utilizing Python, Ansible, open APIs and the variety of other means opening up to us is using them skillfully. A sloppy script will take down a network, but a careless shutdown command or a mishandled VTP revision number will cause significant devastation as well.
Therefore, it’s not the methods and techniques that are flawed, necessarily, but the network engineer. We make mistakes, and we need to find ways to mitigate those mistakes. Network automation absolutely does this, but just like any other professional endeavor, it requires skill to utilize effectively.
Also, there is a sentiment that network engineers’ jobs will disappear or change so much that their skills will not be very relevant. I see this assumption all over the place, and maybe addressing it should be its own post, but for now I’d like to address it simply by saying that a network engineer will always need to know what a VLAN is and what the K values are in EIGRP. Network engineers will always need to have a deep understanding of networking concepts because of the simple fact that they are managing networks. Yes, some elements of configuration may be abstracted; nevertheless, the serious and professional network engineer will still need to know the deep aspects of networking in order to make the magic happen.
Network automation is another tool in the network engineer’s toolbox.
Picture a veteran astronomer working late one evening, PhD diploma on the wall, coffee at the ready, and dual monitors flickering with incoming data from an exciting new object he’s discovered in the brilliant night sky. Quickly he and his graduate student assistants put together a Python script to analyze the data in order to deepen their understanding of this exciting new phenomenon in the sea of space.
When the astronomer turned aside to write this script, was he no longer an astronomer? Did the usefulness of using Python to analyze and manipulate awesome space data immediately negate any need for a serious understanding of astronomy? Of course not!
In this case, Python is just a tool in the scientist’s toolbox for being a more effective astronomer. In the same way, using the network automation techniques developed and promulgated by the open source community and vendors alike does not mean the network engineer is no longer relevant. In fact, network automation is simply another tool in the network engineer’s toolbox, allowing him or her to be that much better an engineer.
I’m pretty new to this brave new world, and my passion is still certainly in traditional networking, but I can say confidently that no one is trying to re-invent the networking wheel. The industry is just trying to make the ride smoother.