Thursday August 09, 2007 at 01:59
Subject: Heartbeat 2.0.2 with ipmilan STONITH.
Keywords:
Heartbeat, IPMI, STONITH, Technical
Posted by: Sean Reifschneider
I spent most of the day today trying to get IPMI STONITH working with
Heartbeat. IPMI is a system management protocol, usually implemented via
an auxiliary controller, for doing various management functions including
getting sensor data (fan speed, temp) and turning a server on and off. The
IPMI controller is on even if the system is otherwise powered off.
However, the ipmilan STONITH plugin is in pretty rough shape.
If you've gotten here via Google and are hoping this will help you get
IPMI set up on your cluster, let me cut to the chase: The ipmilan STONITH
driver appears to be completely unusable. You will probably have to do
what I'm doing and implement a STONITH external script that uses ipmitool
to do the job.
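For illustration only, here is a minimal sketch of what the core of such a script might do: translate a STONITH action into an ipmitool "chassis power" call. It is not the actual script mentioned above. The function names, the host, and the credentials are hypothetical, and it assumes ipmitool is installed and that the controller answers on the IPMI v1.5 "lan" interface.

```python
import subprocess

# Power actions this sketch forwards to "ipmitool chassis power".
POWER_ACTIONS = ("on", "off", "reset", "status")


def ipmitool_command(host, user, password, action, interface="lan"):
    """Build the ipmitool argument list for one power action (hypothetical helper)."""
    if action not in POWER_ACTIONS:
        raise ValueError("unsupported action: %r" % (action,))
    return [
        "ipmitool",
        "-I", interface,   # LAN interface to use
        "-H", host,        # address of the remote IPMI controller
        "-U", user,
        "-P", password,    # note: visible in the process list
        "chassis", "power", action,
    ]


def fence(host, user, password, action):
    """Run ipmitool and report success based on its exit status."""
    result = subprocess.run(
        ipmitool_command(host, user, password, action),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A real external plugin would also need to handle whatever other actions Heartbeat asks of it; this only covers the power control part.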
The first problem I ran into was that when I tried to set up STONITH
with ipmilan, according to the README, it would report:
    CRITICAL **: Unable to setup connection: 16

Google wasn't very helpful; it just turned up someone else asking about this error 18 months ago, with no response.

I dug into the code, and found that the "auth" and "priv" fields, which the documentation says accept values like "none", "md5", and "admin", are passed through the "atoi()" C library call to convert them into integers. Since none of the documented values are actually integer strings, they all silently get converted to 0. That is the core of the problem causing the error above. The "priv" field needs to be the integer 4 for "admin" in my case, but is instead 0.

If you change the "priv" field to "4", and the "auth" field to "2" for "md5", it stops reporting the above error. However, it then starts core dumping due to an invalid pointer dereference.

The IPMI library is incredibly poorly documented, and to make it worse the STONITH ipmilan plugin is using a deprecated function. My opinion is that ipmilan needs to be scrapped and rewritten, hopefully by someone who knows the OpenIPMI API or at least someone who can find some documentation on it.

I was able to get ipmilan to reboot the remote machine, right before it seg faults, as well as correcting the argument passing problems above. I've sent that patch to the Heartbeat maintainers, but I've also recommended to them that they either completely remove IPMI or at least disable it from the default build.

I just wanted to get this up there where Google could find it so that other people could give up earlier than I did. :-(
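The atoi() problem described above is easy to reproduce. What follows is a minimal Python sketch, not the plugin's code: c_atoi is a hypothetical helper that mimics how C's atoi() treats its input, and parse_field shows one assumed way a config parser could accept both symbolic names and numbers. Only "md5" = 2 and "admin" = 4 come from the post; the other table entries follow the IPMI specification's numbering.

```python
import re


def c_atoi(text):
    """Mimic C's atoi(): skip whitespace, read an optional sign and digits, else 0."""
    match = re.match(r"\s*([+-]?[0-9]+)", text)
    return int(match.group(1)) if match else 0


# Symbolic names mapped to the integers the IPMI layer expects.
AUTH_TYPES = {"none": 0, "md2": 1, "md5": 2, "straight": 4}
PRIV_LEVELS = {"callback": 1, "user": 2, "operator": 3, "admin": 4}


def parse_field(value, table):
    """Accept a symbolic name or a numeric string; reject anything else loudly."""
    key = value.strip().lower()
    if key in table:
        return table[key]
    if re.fullmatch(r"[0-9]+", key) and int(key) in table.values():
        return int(key)
    raise ValueError("unrecognised value: %r" % (value,))


# The documented symbolic values all collapse to 0 under atoi() semantics.
print(c_atoi("admin"), c_atoi("md5"), c_atoi("4"))
# A lookup keeps the names meaningful and still accepts the numeric workaround.
print(parse_field("admin", PRIV_LEVELS), parse_field("2", AUTH_TYPES))
```

The point of raising an error instead of defaulting to 0 is that a typo in the configuration fails at startup, not later as an opaque connection error.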
| Comment |
Author:
Sean Reifschneider Subject: Oops, it was actually 2.1.2... |
Kevin pointed out that I said 2.0.2, when actually it was the latest heartbeat, version 2.1.2. However, 2.0.2 is almost certainly similarly impacted.
Sean
| Comment |
Steve Webb Subject: heartbeat with port monitor? |
Got any tips for using heartbeat with a port monitor? I'm trying to get a mysqld monitor working that tells heartbeat to switch on a failure. Ever done this?
- Steve
| Comment |
Fredrik Carlsson Subject: Stonith |
Is there any chance that you will post the script you created?
| Comment |
Author:
Sean Reifschneider Subject: Sorry... |
Sorry, I just don't have the time to package, release, and maintain it. Between my other released software and our normal client work, I'm just swamped.
Sean
| Comment |
Allon Herman Subject: Thanks |
Thanks Sean,
I was just about to do an ltrace myself after having no success with strace. Anyway, using numeric values instead of symbolic names for auth and priv seems to have solved my problem!
| Comment |
Allon Herman Subject: more about strange behavior |
Another strange thing about ipmilan's behavior is that stonith with -T on turns the system off, and with -T off turns the system on...
It seg faults in all cases, but only after the good deed is done, so for the time being, I'll live with it.