Cameron Haight (@cameron_haight) and I recently published research on how monitoring is applied to web-scale environments. Companies such as Amazon, Google, and Facebook run their environments on different fundamentals than typical enterprise IT organizations do. The differences include changes in infrastructure, management software, and the applications running on that infrastructure (among many other things, including people and process, which we don’t get into in this research).
In this research we cover some of the core fundamentals of the open source and commercial software systems that can support web-scale environments, and that are often built on the same principles that distinguish those environments. Many of these elements have to do with eventual consistency, size and scale, volatility, and the application performance that customers and consumers demand.
Further in the research we investigate the different ways data is collected and, once it is collected, the visualization and analytics performed by both the user and the software to draw meaning from the vast amount of data.
We also converted this material into a presentation for the recent Gartner Data Center Conference, held in early December in Las Vegas, which covered similar topics. We ran several polls during the session, and I should have the results in the next couple of weeks. In the presentation we also dug into some of the open source tools (statsd, collectd, Graphite, and other associated projects for metric collection) and vendor-supplied tools, including those from AppDynamics, AppFirst, Boundary, Circonus, Datadog, Librato, New Relic, Sumo Logic, and Splunk.