Archive for September, 2004
People who use gexec and pcp on the latest Linux kernels will find that they hang when executed. The problem is that Linux 2.4.x doesn’t
implement the full set of POSIX cancellation points (e.g., sem_wait and
sigwait are not implemented as cancellation points). This, it turns out, is the
fundamental cause of gexec and pcp hanging on these systems. Also,
terminal-related signals (e.g., SIGTTIN) don’t appear to be handled
correctly. I’m told that in 2.6.x kernels, some of these problems
have been fixed. In the meantime, set the LD_ASSUME_KERNEL environment variable before you start gexec daemons or clients.
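As a rough sketch of the workaround, the Python helper below sets LD_ASSUME_KERNEL only for the process it launches, so the rest of your session is unaffected. The helper names are hypothetical, and the value "2.4.1" in the usage comment is an assumption, not something this post specifies; the right value depends on your distribution's glibc, so check its release notes before relying on it.

```python
import os
import subprocess


def env_with_assumed_kernel(version, base_env=None):
    """Return a copy of base_env (default: os.environ) with LD_ASSUME_KERNEL set.

    The input mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_ASSUME_KERNEL"] = version
    return env


def run_with_assumed_kernel(argv, version):
    """Run argv with LD_ASSUME_KERNEL set for the child process only."""
    return subprocess.run(argv, env=env_with_assumed_kernel(version))


# Hypothetical usage; the version string and the command line are
# assumptions for illustration only:
#   run_with_assumed_kernel(["gexec", "-n", "0", "hostname"], "2.4.1")
```

Setting the variable per process rather than exporting it globally keeps unrelated programs on the default thread library.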
In the future, most (if not all) ganglia components will not rely on POSIX threads at all, given the chaotic nature of threads on Linux.