For all the talk about hyperscale data centers, there hasn’t actually been a ton of detailed information about how they do the network stuff they do. This week, it was interesting to see some detail come out via blog/video from Facebook Network Engineer Alexey Andreyev regarding the design of the network for the new Facebook Altoona, IA data center. While most mainstream enterprises can’t relate to the scale, budget, and skillset/personnel of an organization like Facebook, there are some tangible takeaways. In other words, while a lot of Facebook’s design constructs are built for scale, many of the underlying principles can apply to smaller data center networks. Here are some of my key takeaways, and also food for thought the next time you “like” something or “tag” somebody…
Keeping it Simple
From the Facebook blog: “Our goal is to make deploying and operating our networks easier and faster over time…” I couldn’t agree more, and improving/simplifying Network Operations is a key area when we evaluate solutions as part of the Data Center Networking Magic Quadrant. One of the specific ways Facebook simplifies things is to automate wherever they can, which reduces manual error and scales much better (here’s a related blog on network automation).
Less is More
Facebook uses smaller, simpler, and cheaper network switching infrastructure. This has direct applicability in the mainstream and we’ve published on it here: Rightsizing the Enterprise Data Center Network. Here’s Facebook’s take on it: “…it requires only basic mid-size switches to aggregate the TORs. The smaller port density of the fabric switches makes their internal architecture very simple, modular, and robust, and there are several easy-to-find options available from multiple sources.”
They refer to their design as a Core/Pod architecture, with each pod containing 48 server racks built as a leaf/spine topology. Interconnectivity between pods is 40G and non-oversubscribed, also arranged as a leaf/spine. The pod approach is modular, allowing them to evolve and iterate their network design as requirements and technological capabilities change.
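To make the pod math concrete, here is a quick sizing sketch. The 48-rack pod and the 40G, non-oversubscribed inter-pod links come from the Facebook blog; the per-switch port counts, server counts, and NIC speeds below are purely illustrative assumptions, not Facebook’s actual numbers:

```python
# Hypothetical leaf/spine pod sizing sketch.
# The 48-rack pod and 40G fabric links come from the Facebook blog;
# every other number below is an illustrative assumption.

def oversubscription(downlink_gbps_total, uplink_gbps_total):
    """Ratio of server-facing bandwidth to fabric-facing bandwidth."""
    return downlink_gbps_total / uplink_gbps_total

racks_per_pod = 48          # one TOR (leaf) per rack, per the blog
uplinks_per_tor = 4         # assumption: each TOR uplinks to 4 fabric switches
uplink_speed_gbps = 40      # 40G fabric links, per the blog
servers_per_rack = 20       # assumption
server_nic_gbps = 10        # assumption

per_tor_down = servers_per_rack * server_nic_gbps   # 200 Gbps toward servers
per_tor_up = uplinks_per_tor * uplink_speed_gbps    # 160 Gbps toward the fabric

print(f"Per-TOR oversubscription: {oversubscription(per_tor_down, per_tor_up):.2f}:1")
# Pod-to-pod capacity, by contrast, is non-oversubscribed (1:1) in Facebook's design.
```

The point of the exercise: the only place oversubscription shows up is at the rack edge, which is exactly what makes the fabric layer simple and uniform.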
While I think many network practitioners now realize that traffic is shifting from traditional North/South (app-to-user) patterns to East/West (app-to-app), the blog includes a powerful data point: “What happens inside the Facebook data centers – “machine to machine” traffic – is several orders of magnitude larger than what goes out to the Internet.” Similarly, Cisco recently reported a study finding that intra-data center traffic accounted for 77% of total data center traffic in 2013, and projected it will remain that high through 2018. We’ve been seeing this trend for several years due to changing application architectures, among other things. Net net, we recommend that new data center network builds should be a 1- or 2-tier Ethernet fabric that is optimized for both north/south and east/west traffic, with deterministic latency between any two points. We’ve published several pieces of research on data center fabrics including: Competitive Landscape: Data Center Ethernet Fabric and Technology Overview for Ethernet Switching Fabric.
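The “deterministic latency” point is easy to see with a toy model: in a two-tier leaf/spine fabric, every inter-rack path crosses exactly leaf, spine, leaf, so the switch hop count is identical between any two racks. A minimal sketch (rack names and hop counts are illustrative, not from the blog):

```python
# Illustrative sketch of why a 2-tier leaf/spine fabric gives deterministic
# latency: any inter-rack flow crosses exactly leaf -> spine -> leaf, so the
# switch hop count is uniform no matter where the two racks sit.

def switch_hops(src_rack: str, dst_rack: str) -> int:
    """Switch hops for a flow: same rack stays on the TOR; otherwise leaf-spine-leaf."""
    return 1 if src_rack == dst_rack else 3

print(switch_hops("rack-1", "rack-2"))    # 3
print(switch_hops("rack-1", "rack-48"))   # 3 -- identical, hence deterministic
```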
The blog post doesn’t mention specific vendors, but Facebook has previously blogged about using disaggregated switching approaches with their own software (FBOSS) running on white-box style hardware. Here’s some more information on the topic of disaggregation, with additional research coming soon…
But Wait, no SDN?
While there’s no explicit mention of SDN controllers per se, they’re doing some interesting stuff. They run L3 ECMP using BGP, but there is a centralized BGP controller with “override” capability, which sounds a bit SDN-ish to me. This wasn’t lost on other readers either, and Tech Target News Director Shamus McGillicuddy (@shamusTT) captured the sentiment very well via a tweet: “i want more details on the home-grown BGP controller that has “override” capability over the distributed BGP control plane.”
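The blog doesn’t describe how the controller actually works, but conceptually a centralized “override” on top of a distributed BGP control plane could look something like the sketch below, where controller-injected next hops take precedence over locally learned ECMP paths. Every name and data structure here is hypothetical, not Facebook’s implementation:

```python
# Hypothetical sketch of "controller override" layered on distributed BGP ECMP.
# Facebook's blog mentions a centralized BGP controller with override
# capability; the structures and selection logic below are illustrative only.

class RouteTable:
    def __init__(self):
        self.bgp_paths = {}   # prefix -> ECMP next hops learned via BGP
        self.overrides = {}   # prefix -> next hops injected by the controller

    def learn(self, prefix, next_hops):
        """Install paths learned through the distributed BGP control plane."""
        self.bgp_paths[prefix] = next_hops

    def override(self, prefix, next_hops):
        """Controller pins traffic for a prefix, e.g. to drain a faulty path."""
        self.overrides[prefix] = next_hops

    def clear_override(self, prefix):
        """Withdraw the controller's override, reverting to BGP-learned paths."""
        self.overrides.pop(prefix, None)

    def lookup(self, prefix):
        # Centralized (controller-injected) state wins over distributed BGP state.
        return self.overrides.get(prefix, self.bgp_paths.get(prefix, []))

rib = RouteTable()
rib.learn("10.0.1.0/24", ["spine-1", "spine-2", "spine-3", "spine-4"])
rib.override("10.0.1.0/24", ["spine-1", "spine-2"])   # drain spine-3/spine-4
print(rib.lookup("10.0.1.0/24"))   # ['spine-1', 'spine-2']
```

The design choice worth noting: the distributed protocol keeps running and provides the default behavior, while the controller only intervenes for exceptions, which is arguably the pragmatic middle ground between classic routing and full SDN.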