Date: Tue, 28 Feb 2006 16:38:03 -0600
From: Mike Holloway <mikhollo@cisco.com>
To: freebsd-proliant@freebsd.org
Subject: hpasmcli locks up a DL380G3
Message-ID: <A26FFDF4-5C23-4C09-8934-73914CD0F331@cisco.com>
>> Hi!
>>
>> Sorry for being late on this one, found this browsing around.
>>
>> Yes, I have had ONE machine lock up on me once.
>> An older HP Proliant DL380G1 UP. Just as you describe, it had been
>> working great for a couple of weeks, then suddenly when starting
>> hpasmcli it froze. Couldn't even ping the machine.
>>
>> This particular machine really is not doing anything, and as I believe
>> it still is running (Moved/changed job) and I could probably
>> recreate the lockup.
>> I still have access to this machine, so if anyone wants me to try
>> something, I can do it.
>> The machine is 600km away from me now, so if lockup occurs it can
>> take some time to get it powercycled though.
>>
>> Oh, 5.3 or 5.4 as I recall.
>>
>> Have you seen any other lockup Greg?
>
>I haven't tempted fate that way yet. I always restart the hpasmd
>before using the client on a machine. This seems to avoid the problem.
>
>Thanks for responding to my mail, you're the third person to confirm
>the problem, which given that it locks the machine up hard, is a very
>serious one.
>
>best.
>greg.

Besides the hpasmcli tool hanging just after the banner message, I've
also experienced reboots caused by hpasmd, and have had to remove it
completely from my test lab servers.

I was able to find a scenario which would invariably cause the servers
to reboot. I had hpasmd running on approximately 20 HP DL380 G4 servers,
all running the same customized FreeBSD 6.0 release kernel on x86
(Intel Xeon). All machines were configured to run
hpasmcli -s "show temps;" every 5 minutes from cron, inside a perl
wrapper around hpasmcli (included below) which would kill the perl
wrapper process (and so hpasmcli) via an ALRM signal if hpasmcli didn't
exit within 45 seconds. Within a few hours, a few machines would show
the hpasmcli tool hanging and only displaying the banner message. Cron
would keep running and killing the hung hpasmcli tool every 5 minutes,
for some hours, before I would notice.

After commenting out the cron job and verifying that no hpasmcli
processes existed, I could then stop hpasmd via the init script, which
sends a TERM signal to the process followed by a KILL signal a couple of
seconds later. Without exception those servers would spontaneously
reboot a few minutes (2-5) later. On servers where the hpasmcli tool
hadn't yet hung, I could stop hpasmd with no ill effects to the system.

John, are you still working on this very useful tool? I can provide
access to a DL380 G4 if you need a platform to test on.

-mike

#!/usr/bin/perl
# Run hpasmcli, giving up if it has not exited after 45 seconds.
eval {
    local $SIG{ALRM} = sub {
        # Ignore HUP in this process, then send HUP (signal 1) to the
        # process group whose ID is our PID. That reaches the hpasmcli
        # child when this wrapper is the group leader.
        local $SIG{HUP} = 'IGNORE';
        kill 1, (-$$);
    };
    alarm 45;
    system("/usr/sbin/hpasmcli -s \"show temps;\"");
    alarm 0;    # hpasmcli returned in time; cancel the pending alarm
};
$SIG{HUP} = 'DEFAULT';
exit 0;
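
Since the hung hpasmcli runs went unnoticed for hours, a variant of the
wrapper above that reports a timeout through its exit status might make
the hangs easier to spot. The following is only a rough sketch and not
part of the original setup: the sub name run_with_timeout, the
command-line form (seconds, then the command), the TERM-then-KILL
escalation with a roughly 2 second grace period, and the exit status 124
are all assumptions made for illustration. It forks the command into its
own process group, so only the command (not the wrapper) is signalled.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Run @cmd in its own process group and wait up to $timeout seconds.
# Returns (wait status as in $?, 1 if the command had to be killed else 0).
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        setpgrp(0, 0);    # child leads a new process group (pgid == its pid)
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }

    my ($status, $timed_out) = (0, 0);
    eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        waitpid($pid, 0);
        $status = $?;
        alarm 0;          # command finished in time; cancel the alarm
    };
    if ($@) {
        die $@ unless $@ eq "timeout\n";
        $timed_out = 1;
        kill '-TERM', $pid;    # negative signal name: signal the whole group
        my $reaped = 0;
        for (1 .. 20) {        # poll for up to about 2 seconds
            if (waitpid($pid, WNOHANG) == $pid) { $reaped = 1; last }
            select(undef, undef, undef, 0.1);
        }
        unless ($reaped) {
            kill '-KILL', $pid;    # still alive after the grace period
            waitpid($pid, 0);
        }
        $status = $?;
    }
    return ($status, $timed_out);
}

# Command-line use: SECONDS COMMAND [ARGS...]
if (@ARGV >= 2) {
    my ($status, $timed_out) = run_with_timeout(@ARGV);
    warn "command timed out after $ARGV[0] seconds and was killed\n"
        if $timed_out;
    exit 124 if $timed_out;    # 124 is arbitrary (the value GNU timeout uses)
    exit(($status & 127) ? 1 : $status >> 8);
}

1;
```

Saved under a placeholder name such as wrapper.pl, it could be run as
perl wrapper.pl 45 /usr/sbin/hpasmcli -s "show temps;" so that a
non-zero exit status or the warning on stderr shows up in cron mail.
It does nothing about the reboots seen when stopping hpasmd.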