What is Ganglia?

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.
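To give a feel for why XDR is described as compact and portable, here is a minimal sketch of XDR-style encoding written with Python's struct module. It follows the general XDR rules (big-endian 4-byte integers, length-prefixed strings padded to a multiple of four bytes). The record layout in encode_sample is purely hypothetical and is not the actual Ganglia wire format.

```python
import struct


def xdr_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_float(value: float) -> bytes:
    """Encode a single-precision float as 4 big-endian bytes."""
    return struct.pack(">f", value)


def xdr_string(text: str) -> bytes:
    """Encode a string: 4-byte length, the bytes, then zero padding to a multiple of 4."""
    data = text.encode("utf-8")
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * padding


def encode_sample(host: str, metric: str, value: float) -> bytes:
    """Hypothetical record (host, metric name, value); NOT the real gmond packet layout."""
    return xdr_string(host) + xdr_string(metric) + xdr_float(value)


packet = encode_sample("node01", "load_one", 0.25)
print(len(packet), packet.hex())
```

Because every field is a whole number of 4-byte words in a fixed byte order, the same bytes can be decoded on any architecture without negotiation.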

Ganglia is a BSD-licensed open-source project that grew out of the University of California, Berkeley Millennium Project, which was initially funded in large part by the National Partnership for Advanced Computational Infrastructure (NPACI) and National Science Foundation RI Award EIA-9802069. NPACI is funded by the National Science Foundation and strives to advance science by creating a ubiquitous, continuous, and pervasive national computational infrastructure: the Grid. Current support comes from PlanetLab: an open platform for developing, deploying, and accessing planetary-scale services.

Ganglia 3.1.7 Released

The Ganglia Project (http://ganglia.info) is pleased to announce the official release of Ganglia 3.1.7. The official tarball is available for immediate download at:

http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/3.1.7/

For a full description of the bug fixes and enhancements that are included in the 3.1.7 release as well as upgrade information, please see the current release notes at:

http://ganglia.wiki.sourceforge.net/ganglia_release_notes

Supported platforms:

  • Linux (Fedora/RedHat/CentOS, Debian, Gentoo, SuSE/OpenSuSE)
  • [Open]Solaris
  • FreeBSD
  • NetBSD
  • OpenBSD
  • DragonflyBSD
  • Cygwin (no support for DSO yet)
  • AIX (initial support for DSO now available – please provide feedback)

Please read all the README, INSTALL and other available documentation (http://ganglia.wiki.sourceforge.net) as a lot of things have changed since version 3.0. Use good deployment practices when upgrading from 3.0.x to make sure that you do not mix gmond 3.0 and 3.1 nodes in the same cluster (as defined by a multicast address or unicast collector node). The protocol that allows gmond nodes to communicate within the same cluster has changed. However, the XML packets that are passed between gmond and gmetad have remained compatible from 3.0.x to 3.1.x, allowing a 3.1.x gmetad to continue to pull data from an older 3.0.x gmond cluster.
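As a rough illustration of the XML that flows from gmond to gmetad, the sketch below parses a small hand-written sample and collects metric values per host. The sample is simplified and illustrative only: the element and attribute names (GANGLIA_XML, CLUSTER, HOST, METRIC, NAME, VAL) are assumed to match what gmond emits, real output carries many more attributes, and the host names and values are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; real gmond output has more attributes.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="example" LOCALTIME="0">
    <HOST NAME="node01" IP="10.0.0.1" REPORTED="0">
      <METRIC NAME="load_one" VAL="0.25" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
    <HOST NAME="node02" IP="10.0.0.2" REPORTED="0">
      <METRIC NAME="load_one" VAL="1.50" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def metrics_by_host(xml_text: str) -> dict:
    """Return {host name: {metric name: value string}} for every HOST element."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


for host, metrics in metrics_by_host(SAMPLE_XML).items():
    print(host, metrics)
```

A consumer written this way only depends on the XML layout, which is why a gmetad can keep reading from a gmond of a different minor series as long as that layout stays compatible.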

Ganglia Development Team

[Update: Initial support for DSO is available on AIX]

50% Discount for Hadoop World 2009

For those who don’t know, I joined Cloudera in January of this year. One of my first projects was to help create an Apache 2.0 licensed distribution of Apache Hadoop that is easy-to-install and stable. If you’re interested in running Hadoop on your cluster, you should take a moment to read my recent Cloudera blog post on our latest release. I’m thrilled to be working for a company that values open-source software. I’m also a committer on the Apache Avro project, where I’m working on the C implementation of the Avro marshalling/RPC protocol. Avro has great promise as a replacement for our current XDR/XML schemas. More on that in a later post.

For all the ganglia users out there that run Hadoop on their clusters and want to go to Hadoop World 2009, I’ve got a special discount code for you. Just use the discount code hadoopworld_friend_cloudera_ganglia when you register and you’ll immediately get 50% off the cost of registration (a drop from $299 to $149.50). This discount code expires September 21st so don’t wait too long to register. I want the ganglia community to have a chance to be strongly represented at Hadoop World.

Hadoop World 2009 will be in New York on October 2nd, 2009. I invite you to come learn about what the following companies have done with Hadoop: About.com, Booz Allen Hamilton, China Mobile, ContextWeb, eBay, Facebook, IBM, Intel, JPMC, Microsoft, The New York Times, NexR, Rackspace, Vertica, Visa, Visible Measures, Yale, and Yahoo! If you have ever wondered what Hadoop might be able to do for you, this is your chance to learn both from leaders in the webspace and within your own industry.

Ganglia Code Swarm

When I updated the look of the ganglia website, I couldn’t help but feel a bit nostalgic looking through the old posts and pages. I joined the UC Berkeley CS Dept. back in 1999 and, after a couple of internal prototypes, released ganglia on SourceForge in 2000. Since its release, we built quite a team around ganglia and I’m proud of our accomplishments. We’ve created software that people have relied on to monitor their infrastructure for nine years running. Even though ganglia is a bit of a niche product targeting clusters and Grids, it’s still been downloaded over a million times.

For fun, I decided to visualize the history of ganglia using a code_swarm from June 19th, 2002 to present (btw, June 19th was the day we moved from CVS to Subversion).

Please keep in mind that this video only highlights the activity of ganglia committers but we’ve had so much help along the way. Many times, committers are checking in patches submitted by one of the scores of volunteers that have enriched our software over the years.

Sit back and enjoy the fireworks!

May there be many more sparks to come.

Monitoring Hadoop Clusters with Ganglia

Apache Hadoop is an open-source implementation of MapReduce. Hadoop users will be happy to know that Hadoop has built-in support for publishing run-time metrics using Ganglia. For more details, visit the GangliaContext page on the Hadoop Wiki or Philip Zeyliger’s blog post on the Cloudera blog. Cloudera offers an Apache 2.0 licensed distribution to make managing Hadoop clusters easier.

Ganglia 3.1.2 (Langley) Released

The Ganglia Project (http://ganglia.info) is pleased to announce the official release of Ganglia 3.1.2. The official tarball is available for immediate download at:

http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280&release_id=661845

For a full description of the bug fixes and enhancements that are included in the 3.1.2 release as well as upgrade information, please see the current release notes at:

http://ganglia.wiki.sourceforge.net/ganglia_release_notes

Supported platforms:

  • Linux (Fedora/RedHat/CentOS, Debian, Gentoo, SuSE/OpenSuSE)
  • [Open]Solaris
  • FreeBSD
  • NetBSD
  • OpenBSD
  • DragonflyBSD
  • Cygwin (no support for DSO yet)
  • AIX (no support for DSO yet)

Please read all the README, INSTALL and other available documentation (http://ganglia.wiki.sourceforge.net) as a lot of things have changed since version 3.0. Use good deployment practices when upgrading from 3.0.x to make sure that you do not mix gmond 3.0 and 3.1 nodes in the same cluster (as defined by a multicast address or unicast collector node). The protocol that allows gmond nodes to communicate within the same cluster has changed. However, the XML packets that are passed between gmond and gmetad have remained compatible from 3.0.x to 3.1.x, allowing a 3.1.x gmetad to continue to pull data from an older 3.0.x gmond cluster.

Ganglia Development Team

Ganglia 3.1.1 (Wien) Released

The Ganglia Project (http://ganglia.info) is pleased to announce the
official release of Ganglia 3.1.1. The official tarball is available for
immediate download at:

http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280&release_id=625044

For a full description of the bug fixes and enhancements that are included
in the 3.1.1 release as well as upgrade information, please see the current
release notes at:

http://ganglia.wiki.sourceforge.net/ganglia_release_notes

Supported platforms:

  • Linux (Fedora/RedHat/CentOS, Debian, Gentoo, SuSE/OpenSuSE)
  • [Open]Solaris
  • FreeBSD
  • NetBSD
  • OpenBSD
  • DragonflyBSD
  • Cygwin (no support for DSO yet)
  • AIX (no support for DSO yet)

Please read all the README, INSTALL and other available documentation
(http://ganglia.wiki.sourceforge.net) as a lot of things have changed since
version 3.0. Use good deployment practices when upgrading from 3.0 to make sure
that you do not mix gmond 3.0 and 3.1 nodes in the same cluster (as defined by
a multicast address or unicast collector node). The protocol that allows gmond
nodes to communicate within the same cluster has changed. However, the XML
packets that are passed between gmond and gmetad have remained compatible from
3.0.x to 3.1.x, allowing a 3.1.x gmetad to continue to pull data from an older
3.0.x gmond cluster.

Ganglia Development Team

Ganglia 3.1.0 (Amelia) Released

The Ganglia Project (http://ganglia.info) is pleased to announce the first
official release of Ganglia 3.1.0. The official tarball is available for
immediate download at:

http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280&release_id=616721

Please refer to http://ganglia.wiki.sourceforge.net/ganglia_release_notes
for more information.

(There is a known bug with 3.1.0 gmetad aggregating XML data from another 3.1.0 gmetad — if your environment requires this feature, wait for 3.1.1 to be released. Alternatively, a patch has been developed and is available here)

The main features of this release are:

  • Introduction of a modular metric interface for C and Python (DSO support)
  • Scriptable metric module support with Python
  • All pre-existing metrics (CPU, network, disk, memory, etc.) converted to metric modules
  • Introduction of new metric modules multicpu, multidisk and tcp_conn status
  • Modular frontend graph support
  • Metric groups which can be viewed or hidden as desired
  • Additional scaling capacity for systems with memory greater than 4TB
  • Platform support for DragonFlyBSD
  • Improved native metric support for Windows (built with Cygwin)
  • Bug fixes and enhancements
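For readers curious what a scriptable Python metric module looks like, here is a minimal sketch. It assumes the interface described in the gmond Python module documentation: a metric_init function that returns a list of descriptor dictionaries, a callback per metric that receives the metric name, and a metric_cleanup function. The metric name example_random, the group name and the value range are hypothetical, and the exact descriptor keys should be checked against the documentation for your version.

```python
import random


def random_value(name):
    """Callback: receives the metric name and returns the current value."""
    return random.randint(0, 100)


def metric_init(params):
    """Called once when the module is loaded; returns the metric descriptors."""
    descriptor = {
        "name": "example_random",        # hypothetical metric name
        "call_back": random_value,       # function invoked to collect the value
        "time_max": 90,                  # maximum seconds between reports
        "value_type": "uint",
        "units": "N",
        "slope": "both",
        "format": "%u",
        "description": "Example random number",
        "groups": "example",             # hypothetical metric group
    }
    return [descriptor]


def metric_cleanup():
    """Called when the module is unloaded; nothing to release here."""
    pass


# Exercise the module outside gmond, the way one might while developing it.
for d in metric_init({}):
    print(d["name"], d["call_back"](d["name"]))
```

Running the file directly like this is a convenient way to debug a callback before handing the module to gmond.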

Supported platforms:

  • Linux (Fedora/RedHat/CentOS, Debian, Gentoo, SuSE/OpenSuSE)
  • [Open]Solaris
  • FreeBSD
  • NetBSD
  • OpenBSD
  • DragonflyBSD
  • Cygwin (no support for DSO yet)
  • AIX (no support for DSO yet)

Please read all the README, INSTALL and other available documentation
(http://ganglia.wiki.sourceforge.net) as a lot of things have changed since
3.0.7. Use good deployment practices when upgrading from 3.0.x to make sure
that you do not mix gmond 3.0 and 3.1 nodes in the same cluster (as defined
by a multicast address or unicast collector node). The protocol that
allows gmond nodes to communicate within the same cluster has changed.
However, the XML packets that are passed between gmond and gmetad have
remained compatible from 3.0.x to 3.1.x, allowing a 3.0.x gmetad to continue
to pull data from a newer 3.1.x gmond cluster.

For those who are interested in upgrading from a 3.0.x installation, your
current gmond and gmetad configuration files will need to be moved from their
current location to /etc/ganglia. If you are attempting the upgrade via an
RPM, the RPM will automatically move your current configuration file to the
new location. However, for gmond, the 3.0.x conf file will not work. Please
use the patch file gmond-3.1.patch available at
http://www.ganglia.info/releases/ to patch your gmond.conf prior to
starting; otherwise gmond will fail to start up.

There are several known issues with the current release which include the
following:

  • no support for C++ to create DSO modules
  • no spoofing from modular metrics (use gmetric if spoofing is needed)
  • race condition for tcpconn python metric module (affects gmond -m)
  • libdir issues related to building for 64bit platforms
  • known build issues for platforms:
    • Darwin (AKA MacOS/X)
    • HPUX
    • Tru64 (AKA OSF/1)
    • Irix

Many of the above issues are being addressed and patches will be applied for
the next minor release of Ganglia 3.1.x. In addition, more information about
the current official release can be found on the Ganglia wiki at
http://ganglia.wiki.sourceforge.net/ganglia_release_notes.

Ganglia Development Team

Ganglia 3.0.7 (Fossett) Released

The Ganglia development team is pleased to announce the release of Ganglia 3.0.7 (Fossett) which is available for immediate download from:

http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280&release_id=580140

This is a bugfix release which fixes bugs that were introduced in 3.0.6 as well as memory leaks in gmond.

Summary of bugfixes in 3.0.7:

  • [web] Host view metric graphs’ “now (x.xx)” number is always 0.00
  • [web] “Show Hosts” toggle did not work
  • [gmond] Fix memory leak from network metrics on Linux (thanks Kumar Vaibhav for reporting)
  • [gmond] Fix spoof memory leak (thanks Martin Hicks for the patch)

Ganglia 3.0.6 (Foss) Released

The Ganglia development team is pleased to release Ganglia
3.0.6 (Foss) which is available for immediate download from:

http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280

This release includes a security fix for a cross-site
scripting vulnerability in the web frontend.

All Ganglia web frontend users are strongly recommended to
upgrade to this version. In most cases the version of the
frontend does not need to match the version of gmetad and/or
gmond. If a problem arises, please drop us a note at
ganglia-general@lists.sourceforge.net.

Special thanks to Romain Wartel at CERN for discovering the
vulnerability and reporting it to us and to Alex Dean for
stepping up with the fix so quickly.

Ganglia 3.0.5 (Louis) Released

The Ganglia development team is proud to release version
3.0.5 (Louis) of the popular Ganglia monitoring software.
Ganglia is a scalable distributed monitoring system for
high-performance computing systems such as clusters and
Grids.

The latest release is available for immediate download from:
http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280

This release has a few feature/portability enhancements as
well as the usual array of bugfixes.

Work is underway for the next (3.1.0) release of Ganglia
which will allow metrics to be dynamically loaded via DSO.

These metrics can be written either in C or Python making it
extremely easy to create plugins for monitoring metrics not
already present by default. Apr, expat and libconfuse will
be built dynamically in the new release which will make
packaging for distributions easier.

Changes:
The following is a summary of changes in this release. For
detailed changelog please refer to the ChangeLog file in the
release distribution tarball:

  • [gmetad] Fixed a bug where messages were discarded on
    MacOSX, so that data from clients was not consistently
    and accurately saved to the rrd files (Mike Walker)
  • [win32] Include documentation (README.WIN) for building
    under Windows
  • [webfrontend] Enlarge graphs by clicking on them (Ulf)
  • [webfrontend] Include RRDTool version in frontend footer
    (Matthew Chambers)
  • [webfrontend] Only set the grid stack cookie if it hasn’t
    been set before (Matt Ryan)
  • [webfrontend] New feature to allow sorting by hosts up and
    hosts down in meta context (Bernard Li, Eli Stair, Timothy
    D Witham)
  • [gstat] New option “-n” to show numeric addresses instead
    of hostname (Bernard Li)
  • Builds under Yellow Dog Linux on Sony PlayStation 3 ppc64
    (Bernard Li)
  • Do not automatically start services (gmond, gmetad) after
    RPM installation (Bernard Li)
  • Add y-labels for some metrics. Needed to fix width of RRD
    images. (Martin Knoblauch)
  • Build system (Autotools) enhancements (Carlo Marcelo
    Arenas Belon)
  • Misc bug fixes
