Archive for category Announcements

Overlaying timeshifted data

When troubleshooting an issue it is often useful to compare current data with data from a previous period. For example, you observe high load on one machine and want to compare it to the load from the previous day. In the past you would simply open two browser windows, one for each period, and compare them. This can be tricky because the two graphs may use different Y scales. To help with that we have recently added the ability to overlay timeshifted data onto any host metric graph in Ganglia. Next to each metric graph you will see a button labeled Timeshift. Clicking it overlays data from the immediately preceding period of the same length, e.g. if you are viewing an hour it overlays data from an hour ago, if you are viewing a day it overlays data from a day ago, etc. For example, this shows the one-minute load average overlaid with data from yesterday:

Timeshifted load one average

The graph below, in contrast, displays data from a week ago overlaid on this week's data:

Timeshift overlay CPU user
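
The window arithmetic behind the overlay can be sketched in a few lines. This is only an illustration of "the same period prior to it"; it is not taken from the Ganglia Web code, and the function name `timeshift_window` is hypothetical.

```python
from datetime import datetime, timedelta


def timeshift_window(start, end):
    """Return the (start, end) of the equally long period just before the given one."""
    span = end - start
    return start - span, end - span


# Viewing the last hour: the overlay covers the hour before that.
now = datetime(2013, 1, 15, 12, 0, 0)  # arbitrary example time
hour_view = (now - timedelta(hours=1), now)
print(timeshift_window(*hour_view))

# Viewing the last day: the overlay covers the previous day.
day_view = (now - timedelta(days=1), now)
print(timeshift_window(*day_view))
```

Because both series end up on one graph, they share a single Y scale, which is what makes the comparison easier than two browser windows.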

Preliminary support for this feature has already landed in the trunk of our Ganglia Web repository:

https://github.com/ganglia/ganglia-web

It is fully usable, although we are still refining it. It should be included in Ganglia Web 3.5.0 once it is fully ready.


Ganglia Web 3.4.2 released

The Ganglia team is pleased to release Ganglia Web 3.4.2. Notable changes are:

  • Improvements to the Live Dashboard
  • Fixed the aggregate graphs metric auto complete which was broken in 3.4.1
  • Added the ability to specify critical and warning thresholds, which can be used in the Live Dashboard and Views
  • Minor bug fixes

This release can be downloaded here.


Ganglia 3.3.7 released

The Ganglia team is pleased to announce the release of 3.3.7. The following issues have been fixed in gmond:

  • BUG100: Fails to start in Solaris containers
  • Fails to start when no address on the network interface (added retry_bind parameter)
  • BUG321: Fails to start when Solaris CPU is in FAILED state (segmentation fault)

The release can be immediately downloaded here.


Ganglia 3.3.6 released

The Ganglia Development Team is happy to announce the release of Ganglia 3.3.6. This release fixes the following bug:

BUG327: memory leak when receive channel is not configured or not hearing any data


Ganglia 3.3.5 released

The Ganglia Development Team is happy to announce the release of Ganglia 3.3.5. This is mainly a bugfix release for 3.3.1 and all users of the 3.3.x series are encouraged to upgrade.

For more details regarding this release, please refer to the Release Notes. Downloads are available at SourceForge.

Work is already under way for the next release. Please stay tuned for upcoming enhancements to the project!


Upcoming Ganglia Web features

We have been working hard on new Ganglia Web features that will be part of Ganglia Web 2.2.0. These are the highlights:

Compare Hosts

Allows you to compare hosts across all the matching metrics (this can mean hundreds of graphs :-) ). You supply a regular expression that matches a set of hosts, and Ganglia will aggregate all matching hosts for each metric. This is useful when you are trying to find out why a particular host or set of hosts is performing differently than another set.

Compare Hosts - Ganglia
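
The selection step described above can be sketched as follows. This is a minimal illustration, not the Ganglia Web implementation: the inventory, host names and the `group_by_metric` helper are all hypothetical, and using an unanchored regex search is an assumption.

```python
import re
from collections import defaultdict


def group_by_metric(host_metrics, host_pattern):
    """Map each metric name to the sorted list of matching hosts that report it."""
    regex = re.compile(host_pattern)
    grouped = defaultdict(list)
    for host, metrics in host_metrics.items():
        if regex.search(host):  # assumption: the pattern may match anywhere in the name
            for metric in metrics:
                grouped[metric].append(host)
    return {metric: sorted(hosts) for metric, hosts in grouped.items()}


# Hypothetical inventory: host name -> metrics it reports.
inventory = {
    "web01": ["load_one", "cpu_user"],
    "web02": ["load_one", "cpu_user"],
    "db01": ["load_one", "disk_free"],
}

# One aggregate graph per metric, each containing every host matched by the regex.
print(group_by_metric(inventory, r"^web"))
```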

Built-in Nagios integration

This feature allows you to use your Ganglia trending data to alert in Nagios. There are a couple of nice additions to the basic check functionality, e.g.

  1. Check heartbeat – as you may know, gmond daemons send a periodic heartbeat (every 20 seconds by default). If the heartbeat is missing it is fair to assume the host is down. This should save you from having to use things like check_ping and alert you to potential downtime much more quickly.
  2. Check multiple metrics – allows you to use a single check for multiple metrics on the same host, e.g. check that disk free on / is more than 30%, on /tmp more than 10%, etc.
  3. Check single metric across multiple hosts (not yet implemented) – use a single check to detect low disk space on a set of hosts defined by a regular expression, e.g. instead of having separate disk checks for every host you would have a single check that gives you a breakdown of the hosts that are not OK.
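
The logic of the first two checks can be sketched as below. This is not the code shipped with Ganglia Web; the function names, the metric names and the 60-second heartbeat cut-off (three missed heartbeats at the default interval) are assumptions made for illustration. Only the exit codes follow the standard Nagios plugin convention.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def check_heartbeat(last_heartbeat, now, max_age=60):
    """Return CRITICAL when the last heartbeat is older than max_age seconds.

    max_age=60 is an assumed cut-off, not a Ganglia default.
    """
    age = now - last_heartbeat
    if age > max_age:
        return CRITICAL, f"CRITICAL - no heartbeat for {age} seconds"
    return OK, f"OK - last heartbeat {age} seconds ago"


def check_min_thresholds(values, minimums):
    """Check several metrics of one host at once; each value must exceed its minimum."""
    missing = sorted(name for name in minimums if name not in values)
    if missing:
        return UNKNOWN, "UNKNOWN - no data for " + ", ".join(missing)
    failed = sorted(name for name, low in minimums.items() if values[name] <= low)
    if failed:
        return CRITICAL, "CRITICAL - not above minimum: " + ", ".join(failed)
    return OK, "OK - all metrics above minimum"


# Hypothetical metric names and values (percent of disk free).
current = {"disk_free_root": 45.0, "disk_free_tmp": 8.0}
print(check_heartbeat(last_heartbeat=1000, now=1030))
print(check_min_thresholds(current, {"disk_free_root": 30, "disk_free_tmp": 10}))
```

A real plugin would print the message and exit with the returned code so that Nagios can pick up the state.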

If you want to peek at how the basic check_metric alert works, check out the Ganglia Nagios integration wiki document.

Aggregate graphs decomposition

When viewing aggregate graphs with more than 6-7 items, colors start to blend together and it may be hard to tell which line on the graph is which. This feature allows you to decompose a graph by taking every item on the aggregate graph and putting it on a separate graph, e.g. a graph like this

Aggregate Graph - Ganglia

will decompose into this

Aggregate graph decomposition
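
Conceptually, decomposition turns one graph with N series into N graphs with one series each. The sketch below shows that idea only; the dictionary layout and the `decompose` helper are hypothetical and do not reflect how Ganglia Web stores graph definitions.

```python
def decompose(aggregate):
    """Split an aggregate graph description into one single-series graph per item."""
    return [
        {"title": f"{aggregate['title']} - {series}", "series": [series]}
        for series in aggregate["series"]
    ]


# Hypothetical aggregate graph with three hosts on it.
aggregate_graph = {"title": "load_one", "series": ["web01", "web02", "web03"]}
for graph in decompose(aggregate_graph):
    print(graph)
```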

Flot client side rendering

We have been using flot, a JavaScript graphing library, for a while now. In this release we are planning to make it even more interactive, e.g. taking items off the graph dynamically.

Utilization heatmaps

In this release we are turning on utilization heatmaps instead of the old style pie charts e.g.

heatmap

Most of the features have already been implemented. We are still polishing up the release and writing documentation. We could always use more help with testing and documenting things so if you are up to it please join us on Freenode channel #ganglia.

If you’d like to test drive some of these changes please visit our demo site.


Ganglia 3.2.0 Released

The Ganglia Project is pleased to announce the official release of Ganglia 3.2.0.

The official tarball is available for immediate download at:

http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/3.2.0/

This release includes:

  • sFlow support – more and even more
  • hostname/ip override – you can specify an arbitrary host name and IP to be shown in the UI.
  • FreeBSD patches
  • Python module improvements
  • Bugfixes and improvements over 3.1.7

Ubuntu PPA:

ppa:rufustfirefly/ganglia

Validated platforms:

  • Linux (Fedora/RedHat/CentOS, Debian)

Not validated platforms:

  • [Open]Solaris
  • FreeBSD
  • NetBSD
  • OpenBSD
  • DragonflyBSD
  • Cygwin
  • AIX

Ganglia Development Team


Announcing Ganglia Web 2.1.1

Ganglia team is announcing the release of Ganglia Web 2.1.1. Notable additions are

Latest release can be downloaded from

https://sourceforge.net/projects/ganglia/files/gweb/2.1.1/

Please follow the installation instructions after the download.

Update: Jeff Buchbinder (@jbuchbinder) was fast and added the missing Edit action to the API. You can now download 2.1.2.

https://sourceforge.net/projects/ganglia/files/gweb/2.1.2/


Overlaying event timeline

In our “Introducing Overlay Events” post we added the ability to specify events that are overlaid on top of graphs. Thanks to the work of Jesse Becker we now also support overlaying an event timeline. To illustrate, this is what an overlaid event timeline looks like:

Events time line 1

This provides you with immediate context and allows you to better correlate metrics. It may also provide you with additional insight. Let’s say you saw something like this

Events time line 2

It may not be the DB backup that is causing the load, and you may want to investigate.

To use the event timeline, all you need to do is supply both the start and end time of the event, e.g.

wget -O /dev/null -q "http://mygangliahost.com/ganglia/api/events_api.php?action=add&start_time=12340000&end_time=12340500&summary=Prod DB Backup&host_regex=db02"
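
If you build the request from a script, it is safer to URL-encode the parameters, since the summary contains spaces. The sketch below only constructs the URL and sends nothing. The parameter names and endpoint path come from the command above; the `event_url` helper is hypothetical.

```python
from urllib.parse import urlencode


def event_url(base_url, summary, host_regex, start_time, end_time=None):
    """Build an events API "add" URL; end_time is only included for ranged events."""
    params = {"action": "add", "start_time": start_time}
    if end_time is not None:
        params["end_time"] = end_time
    params["summary"] = summary
    params["host_regex"] = host_regex
    # urlencode escapes spaces as "+", which is valid in a query string.
    return base_url + "/api/events_api.php?" + urlencode(params)


url = event_url("http://mygangliahost.com/ganglia", "Prod DB Backup", "db02",
                start_time=12340000, end_time=12340500)
print(url)
```

The resulting URL can then be passed to wget, curl or any HTTP client.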

We are working on a generic wrapper to run with any command that will populate the events API. Stay tuned.

To download please visit

https://github.com/vvuksan/ganglia-misc

https://github.com/vvuksan/cronologger


Introducing Overlay Events

One of the most commonly requested Ganglia features has been the ability to overlay events as vertical lines, e.g. to show deploys. Until now there was no built-in functionality in Ganglia to do that, so it had to be “hacked in”. For example, this blog post describes one approach. Fortunately that is now history, as we have added “Overlay Events”. This is a generic feature that allows you to specify a list of events, each with a time (UNIX timestamp) and description as well as the grid, cluster and host regex that the event applies to. This way you can limit an overlay event to a subset of hosts, e.g. a DB backup affects only the DB slave server. You will end up with something like this

Overlay events

To enable overlay events, add the following to your conf.php

$conf['overlay_events'] = true;

Events are configured using a simple JSON array. By default events are stored in the following file

$conf['overlay_events_file'] = $conf['conf_dir'] . "/events.json";

If you are using defaults that is

/var/lib/ganglia/conf/events.json

An example events JSON file, the one used to create the above overlay, looks like this

[
 {"start_time":1308496361,
 "summary":"DB Backup",
 "description":"Prod daily db backup",
 "grid":"*",
 "cluster":"*",
 "host_regex":"centos1"},
 {"start_time":1308497211,
 "summary":"FS cleanup",
 "grid":"*",
 "cluster":"*",
 "host_regex":"centos1"}
]
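
A file in this shape can also be generated from a script instead of being edited by hand. The sketch below rebuilds the two events shown above and serializes them. The `make_event` helper is hypothetical, and treating description as optional is an assumption based only on the second event omitting it.

```python
import json


def make_event(start_time, summary, host_regex, description=None,
               grid="*", cluster="*"):
    """Build one event dictionary using the field names from the example file."""
    event = {"start_time": start_time, "summary": summary}
    if description is not None:
        event["description"] = description
    event.update({"grid": grid, "cluster": cluster, "host_regex": host_regex})
    return event


events = [
    make_event(1308496361, "DB Backup", "centos1",
               description="Prod daily db backup"),
    make_event(1308497211, "FS cleanup", "centos1"),
]

# Serialize as a JSON array; writing it to the events file is left to the caller.
text = json.dumps(events, indent=1)
print(text)
```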

Currently only host_regex is supported, but we are working on adding filtering by grid and cluster. All you need to do now is decide which events to include. Example events you can include:

  • Start time of particular periodic jobs such as DB backups, DB clean ups
  • Deploys
  • Nagios alerts sent

Alternatively you can try the Events API, e.g. I have added the following command to be executed before my critical jobs start

wget -O /dev/null -q "http://mygangliahost.com/ganglia/api/events_api.php?action=add&start_time=now&summary=Prod DB Backup&host_regex=db02"

You can change start_time from now to a UNIX timestamp or a well-formed date.
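
If you prefer to pass a UNIX timestamp, a date string can be converted as sketched below. The `to_unix` helper and its default format string are hypothetical, and interpreting the date as UTC is an assumption; adjust it if your events are recorded in local time.

```python
from datetime import datetime, timezone


def to_unix(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Interpret text as a UTC date and return whole seconds since the epoch."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


# The first start_time in the example events file, written as a UTC date.
print(to_unix("2011-06-19 15:12:41"))  # 1308496361
```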

To download the latest release with Overlay events please visit

https://github.com/vvuksan/ganglia-misc
