Prior to Gartner, I was responsible for enterprise networks, and what I worried about primarily was uptime. However, the industry has changed quite a bit over the past few years, so I posed the following question to several current network folks: “What keeps you up at night (from a network perspective)?” They had some interesting answers, although uptime was still tops…
Network Solutions Architect for a Consulting/Managed Services Firm:
Security issues on the network. Lack of proper design, as in zero trust, between zones. Old school thought process of InfoSec being separate from Networking keeps a lot of bad designs on the table. Also makes support a nightmare or over the fence type of issues.
My take: Ok, so one vote for network security and culture. We call this organizational misalignment and it is certainly a worst practice that we’ve written about: “…networking and security teams are not well-aligned and not collaborating as well as the company needs or expects. Engineering and/or architecture teams from security and network disciplines are thus under different reporting structures, and they sometimes do not reach a common leader until the CIO.” Further, security zoning is hard…Over-segmentation is bad…but wait, so is under-segmentation, and we have blogged/published about “Hazardous network segmentation” as a worst network security practice also. We do answer the question of “What security zones should we utilize to protect our enterprise?” in this published research: Decision Point for Postmodern Security Zones.
Network Architect from the Banking/Financial Industry:
Have I done enough to ensure technical staff are equipped to deal with what’s deployed?
My take: Kudos for getting beyond just “uptime” and focusing on one of the key root causes of network downtime, which is manual configuration error (often made by undertrained staff). Also, I agree that networks are complex, and we’ve written about complexity and the accrual of technical debt as a worst practice. Summary: ½ vote for training and ½ vote for uptime.
Information Systems Manager from the Retail Industry:
Depends on the day. Right now? Carrier routing changes! Took down two locations this month! Nasty to troubleshoot.
My take: I’m still waiting to meet a network practitioner who loves their carrier. Another vote for uptime. Upstream failures in the carrier’s network are always challenging, especially when the local BGP session stays up. Diverse transport and newer path selection algorithms with application-level monitoring can help, which is driving interest in, and adoption of, SD-WAN.
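To make the "BGP is up but the path is broken" problem concrete, here is a minimal sketch of probe-based path selection in Python. It is not how any particular SD-WAN product works; the `PathStats` type, the path names, and the loss and latency thresholds are all assumptions made up for illustration. The idea is simply that a path is judged by what application-level probes measure, not only by whether the routing session is established.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathStats:
    """Latest probe results for one transport (all fields are illustrative)."""
    name: str          # hypothetical label, e.g. "carrier_a" or "lte"
    bgp_up: bool       # state of the local routing session
    loss_pct: float    # packet loss seen by application-level probes
    latency_ms: float  # round-trip latency seen by the same probes


def is_usable(path: PathStats,
              max_loss_pct: float = 2.0,
              max_latency_ms: float = 150.0) -> bool:
    """A path must pass the probe thresholds, not just have BGP up."""
    return (path.bgp_up
            and path.loss_pct <= max_loss_pct
            and path.latency_ms <= max_latency_ms)


def select_path(paths: List[PathStats],
                max_loss_pct: float = 2.0,
                max_latency_ms: float = 150.0) -> Optional[PathStats]:
    """Return the usable path with the lowest loss, then lowest latency.

    Returns None when no path meets the thresholds.
    """
    candidates = [p for p in paths
                  if is_usable(p, max_loss_pct, max_latency_ms)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (p.loss_pct, p.latency_ms))


if __name__ == "__main__":
    paths = [
        # BGP session is still up, but an upstream failure drops everything.
        PathStats("carrier_a", bgp_up=True, loss_pct=100.0, latency_ms=0.0),
        PathStats("lte", bgp_up=True, loss_pct=0.5, latency_ms=80.0),
    ]
    best = select_path(paths)
    print(best.name if best else "no usable path")
```

In this example, routing alone would see two healthy sessions, while the probe data steers traffic away from the carrier path that is silently dropping packets.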
Network Manager from the Retail/Hospitality Industry:
That used to be a loaded question for me. Now all the little things that used to fall into that category and sometimes equaled a bigger problem have faded in order to make room for many engineers’ worst nightmare in today’s ‘new IT world order’: loss of control. In my case, that loss of control comes in the form of widespread regional carrier outages and highly visible public-facing outages of services living in the *aaS abyss. Now the first, I can handle. I just put on my parachute and jump into the fire and let DMVPN over LTE do its thing from a branch perspective. Or from an on-prem datacenter perspective, I simply route around the problem. The second, well… The *aaS providers will tell you that outages will be a thing of the past. “Just move everything to the cloud and your life will be so much easier,” they say. Not the case. Although I’m always thinking and creating contingency plans to work around these issues, the next looming revenue-impacting outage lives with me daily.
My take: You had me at “DMVPNoLTE”. Yet another vote for uptime. The cloud doesn’t fix everything (and neither does SDN). In all seriousness, I’ve heard this sentiment before and can see how this is unnerving to network folks… While you no longer “own” your network infrastructure, you’re still responsible for the performance and availability of the applications that your users access. Congratulations, “cloud broker” has been added to your list of responsibilities. The only fix for this is good old-fashioned network design, including optimizing your network for the cloud and applying general best practices to avoid network downtime, while simultaneously avoiding worst practices.