Tech Blog :: Monitoring Drupal sites with Munin
Monitoring Drupal sites with Munin
One of the applications I've been working with recently is the Munin monitoring tool. Its homepage describes it simply:
Munin is a networked resource monitoring tool that can help analyze resource trends and "what just happened to kill our performance?" problems. It is designed to be very plug and play. A default installation provides a lot of graphs with almost no work.
Getting Munin set up on an Ubuntu server is very easy. (One caveat: a lot of new plugins require the latest version of Munin, which is only available in Ubuntu 10.) Munin works on a "master" and "node" structure, the basic idea being:
- On a cron, the master asks all its nodes for all their stats (usually via port 4949, so configure your firewall accordingly).
- Each node server asks all its plugins for their stats.
- Each plugin dumps out brief key:value pairs.
- The master collects all the data and compiles graphs as images on static HTML pages.
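The polling steps above can be sketched from the master's side. This is only a rough Python sketch, assuming munin-node's plain-text protocol on port 4949 (`list` returns the plugin names, `fetch <plugin>` returns `field.value N` lines ending with a lone `.`). The helper names `parse_fetch` and `poll_node` are made up for illustration; the real master does far more (RRD storage, graphing, limits).

```python
import socket


def parse_fetch(lines):
    """Turn 'field.value 42' lines from a fetch reply into a dict."""
    stats = {}
    for line in lines:
        line = line.strip()
        if line == ".":                      # a lone dot ends the reply
            break
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        if key.endswith(".value"):
            # Values stay strings: a plugin may report "U" for "unknown".
            stats[key[: -len(".value")]] = value.strip()
    return stats


def poll_node(host, port=4949, timeout=5.0):
    """Ask one node for every plugin's current values (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        stream.readline()                    # banner line sent by the node
        stream.write("list\n")
        stream.flush()
        plugins = stream.readline().split()
        results = {}
        for name in plugins:
            stream.write(f"fetch {name}\n")
            stream.flush()
            reply = []
            while True:
                line = stream.readline()
                if not line:                 # connection closed early
                    break
                reply.append(line)
                if line.strip() == ".":
                    break
            results[name] = parse_fetch(reply)
        stream.write("quit\n")
        stream.flush()
        return results
```

Calling `poll_node("localhost")` on a machine running munin-node should return one dict per plugin, provided the node's configuration allows connections from that address.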
Its simplicity is admirable: each plugin is its own script, written in any executable language. There are common environment variables and a common output syntax, but otherwise writing or modifying a plugin is very easy. The plugin directory is called Munin Exchange. (The latest version of each plugin isn't necessarily on there, though: in some cases searching for the plugin name brought up newer versions on GitHub.)
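To give a feel for how small a plugin can be, here is a minimal sketch in Python. It assumes the usual plugin contract: called with `config`, the script prints graph metadata; called with no argument, it prints one `field.value N` line per series. The graph title and the `load` field name are arbitrary choices for this example, not taken from any existing plugin.

```python
#!/usr/bin/env python3
import os
import sys


def render(args, read_load=None):
    """Return the text a Munin plugin prints for the given arguments."""
    if args[:1] == ["config"]:
        # Graph metadata, requested when the graph is being defined.
        return "\n".join([
            "graph_title Load average (example)",
            "graph_vlabel load",
            "graph_category system",
            "load.label 1 minute load",
        ])
    # Normal run: one 'field.value N' line per data series.
    if read_load is None:
        read_load = os.getloadavg           # Unix-only
    one_minute = read_load()[0]
    return f"load.value {one_minute:.2f}"


if __name__ == "__main__":
    print(render(sys.argv[1:]))
```

Made executable and placed in the node's plugins directory, a script like this should show up as one more graph after the node is restarted.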
I set up Munin for two reasons: 1) to get notifications of problems, and 2) to see historical graphs to spot trends and bottlenecks. I have Munin running on a dedicated monitoring server (also running Jenkins), since notifications coming from the web server wouldn't be much use if the web server went down. It's currently monitoring three nodes (including itself), giving me stats on memory (total and for specific processes), CPU, network traffic, Apache, MySQL, S3 buckets, Memcached, Varnish, and MongoDB. Within a few days of it running, a memory leak on one server became apparent, and the "MySQL slow query" spikes that coincide with cron (doing a bunch of stats/aggregation) are illuminating.
None of this is Drupal-specific, but graphing patterns in Drupal simply requires a plugin, and McGo has fortunately given us a Munin module that provides just that. (The package includes two modules: Munin API to define stats and queries, and Munin Defaults with some basic node and user queries.) I asked for maintainer access and modified it a little: the 6.x-2.x branch now uses Drush for database queries rather than storing the credentials in the scripts, for example. The module generates the script code, which you copy to files in your plugins directory.
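To illustrate the Drush idea, here is a rough sketch of a plugin that asks Drush for a number instead of holding database credentials itself. This is not the code the module generates: the Drupal root path, the query, and the function names are assumptions for this example, and it assumes `drush` is on the path and that the site uses no table prefix.

```python
#!/usr/bin/env python3
import subprocess
import sys

DRUPAL_ROOT = "/var/www/drupal"   # hypothetical path, adjust to your site
QUERY = "SELECT COUNT(*) FROM node WHERE status = 1"


def run_drush(query, root=DRUPAL_ROOT):
    """Run a query through 'drush sql-query'; Drush reads the credentials from the site settings."""
    result = subprocess.run(
        ["drush", f"--root={root}", "sql-query", query],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def last_number(output):
    """Pick the last purely numeric line, skipping any column header."""
    for line in reversed(output.splitlines()):
        if line.strip().isdigit():
            return int(line.strip())
    raise ValueError("no numeric result in drush output")


def render(args, runner=run_drush):
    """Return plugin output: metadata for 'config', a value otherwise."""
    if args[:1] == ["config"]:
        return "\n".join([
            "graph_title Published nodes (example)",
            "graph_vlabel nodes",
            "nodes.label published nodes",
        ])
    return f"nodes.value {last_number(runner(QUERY))}"


if __name__ == "__main__":
    print(render(sys.argv[1:]))
```

The point of the design is that the script contains only a path and a query; the database password stays in the site's own settings file.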
Conclusions so far: getting Munin to show you graphs on all the major stats of a server takes a few hours (coming at it as a total beginner). Setting up useful notifications is more complicated, though, and will probably have to evolve over time through trial and error. For simple notifications on servers going down, for example, it's easier to set up a simple cron script (on another server) with curl and mail, or use the free version of CloudKick. Munin's notifications are more suited to spotting spikes and edge cases.
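For the simple "is the site up?" case, a cron job on another server is enough. I mentioned curl and mail; the sketch below does the same thing in Python with the standard library instead. The URL, the addresses, and the local mail relay are all placeholders you would need to change.

```python
#!/usr/bin/env python3
import smtplib
import urllib.request
from email.message import EmailMessage

URL = "http://www.example.com/"        # hypothetical site to check
ALERT_TO = "admin@example.com"         # hypothetical addresses
ALERT_FROM = "monitor@example.com"


def site_is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:   # URLError, HTTPError and socket timeouts all derive from OSError
        return False


def build_alert(url):
    """Build the notification email for a failed check."""
    message = EmailMessage()
    message["Subject"] = f"DOWN: {url}"
    message["From"] = ALERT_FROM
    message["To"] = ALERT_TO
    message.set_content(f"{url} did not respond successfully.")
    return message


if __name__ == "__main__":
    if not site_is_up(URL):
        with smtplib.SMTP("localhost") as smtp:   # assumes a local mail relay
            smtp.send_message(build_alert(URL))
```

Run from cron every few minutes on the monitoring server, this covers the "server went down" case and leaves Munin's notifications for spikes and thresholds.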
You might want to check the munin_api module available from http://features.osinet.eu/munin-api-drupal : its design is very different from the one on drupal.org/project/munin, which is why it has been evolving separately and not maintained on d.o.
Munin certainly rocks at the OS level. We make a tool for the Drupal layer called Droptor. It should give you lots of Drupal data out of the box and should be easier to set up. :)
Good to know about both of those, thanks!
We love Munin for server monitoring and also played around with Munin plugins to customize output. It works quite well. We ended up creating custom RRDs as well as graphs directly in Drupal. This worked so well that we needed a way to scale the idea to all of our client sites. Monitoring only makes sense if it's done from outside (e.g. from an external server). So we started to create drupalmonitor (http://drupal.org/sandbox/lukas.fischer/1398102), which uses RRD as its backend and offers a very easy way to expose metrics using hook_drupalmonitor().
Just wanted to tell this little story. We love the RRD graphs, we love Munin, but we thought Munin plugins are kind of too complex and do not scale to lots of sites.