Date:      Mon, 25 Sep 2006 18:13:29 +0200 (CEST)
From:      kama <kama@pvp.se>
To:        Per olof Ljungmark <peo@intersonic.se>
Cc:        freebsd-proliant@freebsd.org
Subject:   Re: DL380G2 instability
Message-ID:  <20060925174144.R39281@ns1.as.pvp.se>
In-Reply-To: <4517DCE3.2090206@intersonic.se>
References:  <4517C9AE.6000107@intersonic.se> <20060925132626.GG26539@voodoo.schug.net> <4517DCE3.2090206@intersonic.se>



On Mon, 25 Sep 2006, Per olof Ljungmark wrote:

> Christoph Schug wrote:
> > On Mon, Sep 25, 2006, Per olof Ljungmark wrote:
> >
> >> I have tried now for some time to get FBSD running on a couple of
> >> DL380G2 with dual PIII 1.4G but never got it right. The boxes are
> >> identical in both hw setup and the way they behave so I figured this may
> >> be FreeBSD related, not a hardware fault.
> > [...]
> >> Anyone out there who has this model running SMP OK?
> >
> > I've got several DL380 G2 running 6-STABLE and 6.1-RELEASE without any
> > hitch, both in UP and SMP configurations. ACPI is disabled, OS set to
> > 'Linux'. Other BIOS settings are default, firmware levels are as of
> > Proliant Software Maintenance CD 7.40.
> >
> > I would rather guess you encountered a hardware issue (faulty power
> > supply; non-identical stepping of CPUs?). OTOH as you're talking about a
> > couple of machines, this is not very likely. Are you sure, your servers
> > have got the redundant fan kit in place as this is mandatory for SMP
> > operations?
>
> Hi Christoph,
>
> Yes, got the fan kit and CPU's are same stepping and it is two identical
> boxes. As most people seem to be ok with "Linux" as the BIOS setting
> this is what I have used for most of the time. Firmware levels are at
> 7.50 but I don't think there are any updates for this box between 7.40
> and .50. Have even switched power supplies, both boxes have redundant psu's.
>
> I guess I should try 5-STABLE SMP, if that works the probability of hw
> fault would be minimal but then again I'm back to square one in terms of
> what the problem could be.

The reason for checking 5-STABLE is that I have encountered problems with
DL380s and 6-STABLE (mine are all G3s). But I could not get any real info
out of it and could not proceed to make a report. Mine just rebooted (power
cycled), as if someone had pulled the power cables and plugged them back in.
So nothing in the log, and no dump in /var/crash (yes, I specified it).
The iLO only reports a power cycle.

What I do know is that it happens more often under high I/O load: network
traffic of 200-300 Mbps in and out, plus a lot of memory and disk activity.
One other thing I noticed was that the reboots mostly occurred at 00, 15, 30
or 45 (according to logs and iLO). But I could not see anything in cron,
logrotate or anything else that could cause it to reboot at those times. I
disabled the checks for server overload in the BIOS settings and they still
power cycled. But it couldn't have been a problem with the hardware, since
with 5-STABLE it can run for months without problems and with 6-STABLE it
can crash after anywhere from 2 hours to a week. Sometimes it can crash
several times in one day and then be stable and fine for a week.
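
To check whether the reboot times really cluster around those marks, and are
not just remembered that way, something like the rough sketch below could be
run over the timestamps pulled from the logs or the iLO event log. It assumes
that "00, 15, 30 or 45" means minutes past the hour. The function names, the
2-minute tolerance and the sample timestamps are all made up for illustration;
they are not taken from the real logs.

```python
from datetime import datetime


def minutes_from_quarter_hour(ts):
    """Distance in minutes from ts to the nearest :00, :15, :30 or :45 mark."""
    offset = (ts.minute + ts.second / 60.0) % 15
    return min(offset, 15 - offset)


def fraction_near_quarter_hour(timestamps, tolerance_min=2.0):
    """Fraction of timestamps within tolerance_min of a quarter-hour mark."""
    if not timestamps:
        return 0.0
    near = [t for t in timestamps
            if minutes_from_quarter_hour(t) <= tolerance_min]
    return len(near) / len(timestamps)


# Hypothetical reboot times, for illustration only.
reboots = [
    datetime(2006, 9, 1, 3, 15, 10),
    datetime(2006, 9, 3, 14, 44, 50),
    datetime(2006, 9, 7, 22, 0, 30),
    datetime(2006, 9, 9, 8, 37, 0),
]

print(fraction_near_quarter_hour(reboots))  # prints 0.75
```

If the reboots were spread evenly over the hour, a 2-minute tolerance would
catch only about 4/15 (roughly 27%) of them, so a much higher fraction over a
reasonable number of events would suggest something periodic is involved.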

If you get a solid 5-STABLE and 6-STABLE is still a problem, you will be
one more besides me (that I know of) who has encountered problems. As my
boxes are in production, I cannot switch to 6-STABLE just to test things.
So if you are able to do all the steps that need to be done and post a
problem report, that would be great from my point of view. And then I can
only hope that it is the same issue.

/Bjorn


