Nagios is a great product: it’s free, and you can’t beat that. The problem is that the level of usability and sophistication of the product is pretty much zero. Don’t expect any bells and whistles, or really any usability for that matter. The technology is rudimentary at best, but it can get the job done with the right skills on staff.
Many vendors have introduced products which make Nagios more usable. These improve the product itself, its supportability, and your ability to get support when things break. The problem is that the underpinnings and ugliness still exist once you get through the layers intended to cover up the mess that Nagios is. There are still scripted “checks” which run to determine service health, and these checks are normally challenging to manage, especially when some execute through the agent while others do not. Other features added include better management, dashboarding, and other basic capabilities that you would expect out of the box with any monitoring product.
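For readers who haven’t seen one, a scripted check is just a small program that prints one line of status text and reports health through its exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). Here is a minimal sketch of what such a check might look like. The function names, the 80/90 percent thresholds, and the choice of disk usage as the thing being checked are all assumptions for illustration, not taken from any particular Nagios install.

```python
#!/usr/bin/env python3
"""Illustrative sketch of a Nagios-style scripted check (hypothetical example)."""
import shutil
import sys

# Exit codes that a Nagios-style check reports back to the scheduler.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, status_line).

    The warn/crit defaults are assumed values chosen for this example.
    """
    if used_percent >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"DISK WARNING - {used_percent:.1f}% used"
    return OK, f"DISK OK - {used_percent:.1f}% used"


def check_disk(path="/"):
    """Measure disk usage for `path` and evaluate it against the thresholds."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself failed, so health cannot be determined.
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    return evaluate(100.0 * usage.used / usage.total)


if __name__ == "__main__":
    status, message = check_disk()
    print(message)    # one line of human-readable status
    sys.exit(status)  # the exit code is what signals health
```

Multiply this by every service on every host, each with its own thresholds and its own way of being executed, and the management burden becomes clear.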
The problem with all of these approaches is that they don’t auto-configure themselves, they don’t detect application instances properly or consistently, and configuration of checks is painful. Most clients using Nagios will hear me tell them to ditch it and go for a simple and inexpensive monitoring tool. I hear from many Gartner clients who decide to implement open source tools based on a talented engineer on the team, but when that engineer leaves the company no one can figure out how to safely upgrade Nagios or its associated components. (This article goes through some of what is needed to manage Nagios: http://www.debianhelp.co.uk/nagiosinstall.htm)
The time and effort needed to manage this software is much better spent buying a simple monitoring tool to get the basics covered for infrastructure health. Once you lick the easy stuff, infrastructure health monitoring, you can start focusing on the harder problems. Application performance monitoring (APM) tools are where most of the interest is, since they provide end user experience monitoring, in-depth troubleshooting capabilities, and much greater business value to non-technical users.
[EDIT : 11/12/13]
Other Nagios-related blog posts:
- Nagios : Let the religious wars continue
- How to properly leverage open source server monitoring (Nagios)
- Monitoring software sucks so I use Nagios, what’s a better approach?
Probably a better way to do similar types of monitoring, from a wider perspective: