What is Ganglia?

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.

Ganglia is a BSD-licensed open-source project that grew out of the University of California, Berkeley Millennium Project, which was initially funded in large part by the National Partnership for Advanced Computational Infrastructure (NPACI) and National Science Foundation RI Award EIA-9802069. NPACI is funded by the National Science Foundation and strives to advance science by creating a ubiquitous, continuous, and pervasive national computational infrastructure: the Grid. Current support comes from Planet Lab: an open platform for developing, deploying, and accessing planetary-scale services.

Ganglia 3.3.0 released

We are happy to announce the release of Ganglia 3.3.0. Highlights of this release are:

  • Ganglia Monitor Core now ships with our second-generation Ganglia Web UI
  • The gmetad daemon now supports sending metrics to Graphite (carbon)
  • sFlow supports additional metric sources such as JMX, memcache, and Apache
  • Gmond now comes bundled with a number of additional metrics, e.g. disk statistics, network interface utilization, etc.
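
To give a feel for what sending a metric to Graphite involves, here is a minimal sketch of carbon's plaintext protocol, which expects one `path value timestamp` line per data point. The path layout (prefix, cluster, host, metric) and the replacement of dots inside names are assumptions made for illustration, not a description of how gmetad itself names metrics, and `carbon_line` is a hypothetical helper.

```python
import time


def carbon_line(prefix, cluster, host, metric, value, timestamp=None):
    """Format one data point as a carbon plaintext-protocol line."""
    if timestamp is None:
        timestamp = int(time.time())
    # Graphite uses dots to separate path components, so dots inside a
    # component (such as a fully qualified host name) are replaced here.
    # The four-part path layout is an assumption for this sketch.
    parts = [p.replace(".", "_") for p in (prefix, cluster, host, metric)]
    return "{} {} {}\n".format(".".join(parts), value, int(timestamp))


line = carbon_line("ganglia", "web", "db02.example.com", "load_one",
                   0.42, 1234000000)
print(line, end="")
```

A line like this would normally be written to a TCP socket connected to the carbon listener.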

Full release notes can be found on GitHub.

You can download the release from SourceForge.

Please report any issues with the release to our GitHub issue tracker.

Putting metrics data on a common “bus”

Patrick Debois has kicked off an interesting set of projects to put metric information on a common “bus”. For example, he has implemented a Ruby-based daemon that parses Ganglia gmond packets and puts them on a ZeroMQ pub/sub bus. Once the data is there, you can “subscribe” with a client of your choice and transform it, e.g.

  • feed graphite or another monitoring tool
  • insert data into a SQL database
  • feed Nagios using passive checks

Thanks to Patrick for a great idea and implementation. Now let’s get to work on useful subscribers.
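
As a starting point for one such subscriber, here is a rough sketch of the transform step for the Nagios case: it takes one already-decoded metric message and formats it as a Nagios passive service check command. The message fields (`host`, `name`, `value`) and the thresholds are assumptions made for illustration; the actual format published by gmond-zmq may differ, and receiving the message from ZeroMQ is left out.

```python
def passive_check_line(metric, warn, crit, now):
    """Turn one metric message into a Nagios passive service check command."""
    value = float(metric["value"])
    if value >= crit:
        code, label = 2, "CRITICAL"
    elif value >= warn:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    output = "{} - {} = {}".format(label, metric["name"], value)
    # Nagios external command syntax:
    # [time] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}".format(
        int(now), metric["host"], metric["name"], code, output)


# Hypothetical message, shaped as we assume a subscriber would decode it.
message = {"host": "db02", "name": "load_one", "value": "7.5"}
print(passive_check_line(message, warn=4.0, crit=8.0, now=1234000000))
```

The resulting line would then be appended to the Nagios external command file.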

Gmond-zmq https://github.com/jedi4ever/gmond-zmq

Statsd-zmq https://github.com/jedi4ever/statsd

Ganglia Web 2.2.0 released

Today we released version 2.2.0 of the Ganglia Web interface. New features present in this release are described in our “Upcoming Web Features” post.

The release can be downloaded from https://sourceforge.net/projects/ganglia/files/gweb/2.2.0/

If you need help, the installation guide is here.

Upcoming Ganglia Web features

We have been working hard on new Ganglia Web features that will be part of Ganglia Web 2.2.0. These are the highlights:

Compare Hosts

Allows you to compare hosts across all the matching metrics (this can mean hundreds of graphs :-) ). You supply a regular expression that matches a set of hosts and Ganglia will aggregate all hosts for each metric. This is useful in those cases where you are trying to find out why a particular host or set of hosts is performing differently than another set.

[Figure: Compare Hosts - Ganglia]
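
The idea behind Compare Hosts can be sketched in a few lines: select hosts by regular expression, then regroup their values by metric so that each metric can be drawn as one graph. This is only an illustration of the grouping step under assumed data; `group_by_metric` and the sample hosts and values are hypothetical and not part of Ganglia Web.

```python
import re
from collections import defaultdict


def group_by_metric(hosts, host_regex):
    """Collect, per metric, the values of every host whose name matches."""
    pattern = re.compile(host_regex)
    grouped = defaultdict(dict)
    for host, metrics in hosts.items():
        if pattern.search(host):
            for name, value in metrics.items():
                grouped[name][host] = value
    return dict(grouped)


# Made-up sample data: host name -> {metric name: latest value}.
hosts = {
    "web01": {"load_one": 0.5, "cpu_user": 12.0},
    "web02": {"load_one": 3.1, "cpu_user": 55.0},
    "db01": {"load_one": 1.0},
}
for metric, per_host in sorted(group_by_metric(hosts, r"^web").items()):
    print(metric, per_host)
```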

Built-in Nagios integration

This feature allows you to use your Ganglia trending data to alert in Nagios. There are a couple of nice additions to the basic check functionality, e.g.

  1. Check heartbeat – as you may know, gmond daemons send a periodic heartbeat (every 20 seconds by default). If the heartbeat is missing, it is fair to assume the host is down. This should save you from having to use things like check_ping and alert you to potential downtime much more quickly.
  2. Check multiple metrics – allows you to use a single check for multiple metrics on the same host, i.e. check that disk free on / is more than 30%, on /tmp more than 10%, etc.
  3. Check single metric across multiple hosts (not yet implemented) – use a single check to check low disk space on a set of hosts defined by a regular expression, e.g. instead of having separate disk checks for every host you would have a single check that gives you a breakdown of hosts that are not OK.
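
The first two checks can be sketched roughly as follows. This is not the code shipped with Ganglia Web, only an illustration of the logic using the standard Nagios plugin exit codes. The 60-second limit (three missed heartbeats at the default interval) and the metric names are assumptions.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios exit codes


def check_heartbeat(last_heartbeat, now, max_age=60):
    """CRITICAL when the last gmond heartbeat is older than max_age seconds."""
    age = int(now - last_heartbeat)
    if age > max_age:
        return CRITICAL, "CRITICAL - no heartbeat for {} seconds".format(age)
    return OK, "OK - last heartbeat {} seconds ago".format(age)


def check_metrics(values, minimums):
    """One check over several metrics: each value must exceed its minimum."""
    problems = []
    for name, minimum in sorted(minimums.items()):
        value = values.get(name)
        if value is None:
            return UNKNOWN, "UNKNOWN - metric {} not reported".format(name)
        if value <= minimum:
            problems.append("{} = {} (needs > {})".format(name, value, minimum))
    if problems:
        return CRITICAL, "CRITICAL - " + ", ".join(problems)
    return OK, "OK - {} metrics within limits".format(len(minimums))


print(check_heartbeat(last_heartbeat=1000, now=1015))
# Hypothetical metric names for free disk space in percent on / and /tmp.
print(check_metrics({"disk_free_pct_root": 42.0, "disk_free_pct_tmp": 8.0},
                    {"disk_free_pct_root": 30.0, "disk_free_pct_tmp": 10.0}))
```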

If you want to peek at how the basic check_metric alert works, check out the Ganglia Nagios integration wiki document.

Aggregate graphs decomposition

When viewing aggregate graphs with more than 6-7 items, colors start to blend together and it may be hard to tell which line on the graph is which. This feature allows you to decompose a graph by taking every item on the aggregate graph and putting it on a separate graph, e.g. a graph like this

[Figure: Aggregate Graph - Ganglia]

will decompose into this

[Figure: Aggregate graph decomposition]

Flot client side rendering

We have been using Flot, a JavaScript graphing library, for a while now. In this release we are planning to make it even more interactive, i.e. take items off the graph dynamically etc.

Utilization heatmaps

In this release we are turning on utilization heatmaps instead of the old style pie charts e.g.

[Figure: heatmap]

Most of the features have already been implemented. We are still polishing up the release and writing documentation. We could always use more help with testing and documenting things, so if you are up to it, please join us on the Freenode channel #ganglia.

If you’d like to test drive some of these changes please visit our demo site.

Ganglia Web 2.1.8 released

Today we released version 2.1.8 of the Ganglia Web interface. The following changes are present in this release:

  • Better way of showing all metrics when metric groups are initially collapsed. You can see the new behavior if you set $conf['metric_groups_initially_collapsed'] = true;
  • Fix for broken graph zooming
  • Show graph name in autorotation

The release can be downloaded from https://sourceforge.net/projects/ganglia/files/gweb/2.1.8/

Ganglia Web 2.1.7 released

We have identified a bug that prevents the host overview from working. This has been fixed in Ganglia Web 2.1.7.

The release can be downloaded from https://sourceforge.net/projects/ganglia/files/gweb/2.1.7/

Ganglia Web 2.1.5 released

Ganglia Web 2.1.5 has been released. This is a minor release with the following changes:

  • Set upper and lower limits for aggregate graphs
  • Ability to add an event from the Ganglia UI
  • First pass at integrating Flot graphs into the UI (shows when you click on enlarge graph in host view)
  • Buttons for CSV and JSON exports have been changed to use CSS styled buttons
  • Collapse Metric Groups – hide all metrics unless you click on the metric group
    • Add this to conf.php => $conf['metric_groups_initially_collapsed'] = true;

The release can be downloaded from https://sourceforge.net/projects/ganglia/files/gweb/2.1.5/

Ganglia 3.2.0 Released

The Ganglia Project is pleased to announce the official release of Ganglia 3.2.0.

The official tarball is available for immediate download at:

http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/3.2.0/

This release includes:

  • sFlow support – more and even more
  • hostname/ip override – you can specify an arbitrary host name and IP to be shown in the UI.
  • FreeBSD patches
  • Python module improvements
  • Bugfixes and improvements over 3.1.7

Ubuntu PPA:

ppa:rufustfirefly/ganglia

Validated platforms:

  • Linux (Fedora/RedHat/CentOS, Debian)

Not validated platforms:

  • [Open]Solaris
  • FreeBSD
  • NetBSD
  • OpenBSD
  • DragonflyBSD
  • Cygwin
  • AIX

Ganglia Development Team

Announcing Ganglia Web 2.1.1

The Ganglia team is announcing the release of Ganglia Web 2.1.1. Notable additions are

The latest release can be downloaded from

https://sourceforge.net/projects/ganglia/files/gweb/2.1.1/

Please follow the installation instructions after the download.

Update: Jeff Buchbinder (@jbuchbinder) was fast and added the missing Edit action to the API. You can now download 2.1.2.

https://sourceforge.net/projects/ganglia/files/gweb/2.1.2/

Overlaying event timeline

In our “Introducing overlay events” post we added the ability to specify events that are overlaid on top of graphs. Thanks to the work of Jesse Becker, we now also support overlaying an event timeline. To illustrate, this is what an overlaid event timeline looks like

[Figure: Events timeline 1]

This provides you with immediate context and allows you to better correlate metrics. It may also provide you with additional insight. Let’s say you saw something like this

[Figure: Events timeline 2]

It may not be the DB backup that is causing the load, and you may want to investigate.

To use the event timeline, all you need to do is supply both the start and end time of the event, e.g.

wget -O /dev/null -q "http://mygangliahost.com/ganglia/api/events_api.php?action=add&start_time=12340000&end_time=12340500&summary=Prod DB Backup&host_regex=db02"
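
If you would rather build the request from a script, a minimal sketch like the one below assembles the same URL with the query string properly encoded (the space in the summary in particular). The parameter names are the ones used in the wget example above; `event_url` is a hypothetical helper, not part of Ganglia.

```python
from urllib.parse import urlencode


def event_url(base, summary, start_time, end_time, host_regex):
    """Build an events API URL that adds an event with a start and end time."""
    query = urlencode({
        "action": "add",
        "start_time": start_time,
        "end_time": end_time,
        "summary": summary,
        "host_regex": host_regex,
    })
    return "{}/api/events_api.php?{}".format(base.rstrip("/"), query)


url = event_url("http://mygangliahost.com/ganglia", "Prod DB Backup",
                12340000, 12340500, "db02")
print(url)
```

Passing the resulting URL to `urllib.request.urlopen` would send the request; that step is left out here so the sketch runs without a Ganglia host.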

We are working on a generic wrapper to run with any command that will populate the events API. Stay tuned.

To download please visit

https://github.com/vvuksan/ganglia-misc

https://github.com/vvuksan/cronologger
