Date: Thu, 19 Apr 2007 12:01:10 -0400
From: Bill Moran <wmoran@potentialtech.com>
To: Dimitris Zilaskos <dzila@tassadar.physics.auth.gr>
Cc: freebsd-questions@freebsd.org
Subject: Re: random hangs/reboots with Dell servers
Message-ID: <20070419120110.b69c6213.wmoran@potentialtech.com>
In-Reply-To: <6.0.0.22.2.20070419103724.025a8fd8@mail.computinginnovations.com>
References: <Pine.LNX.4.64.0704191333000.7897@tassadar.physics.auth.gr> <6.0.0.22.2.20070419103724.025a8fd8@mail.computinginnovations.com>
In response to Derek Ragona <derek@computinginnovations.com>:

> At 05:54 AM 4/19/2007, Dimitris Zilaskos wrote:
>
> > Dear all,
> >
> > I am trying to understand some long standing issues we have with freebsd
> > and Dell servers.
> >
> > Over the last 3 years we have installed freebsd 5.x and 6.x, with currently
> > deployed version being 6.1, to a variety of Dell rack mounted systems.
> >
> > The Dell systems used so far are Poweredge 1750, 2950 (both scsi), and
> > sc1425 (sata). All of them are dual CPU Xeon systems.
> >
> > All these systems serve as mail/web servers, with 2 to 15 jails.
> >
> > Installation has always proceeded normally without problems. However,
> > after a few months of operation, all of these systems, purchased at
> > different moments during the last 3 years, will begin rebooting randomly
> > or freezing completely.
> >
> > These reboots/freezes will at first occur once per 6 months, then
> > gradually will move to once per month, to normally stabilize around
> > once per week, but in the case of the 1750 system once it even happened
> > twice a day.
> >
> > Load does not seem to matter, since even after shutting down all services
> > in the servers, still random reboots occurred.
> >
> > So far we tried various tricks dug from the archives, like disabling
> > ACPI, HT, but nothing changed.
> >
> > We have migrated some systems that had these issues to RHEL compatible OS,
> > and they run rock solid under heavy load.
> >
> > Right now I have enabled kernel crash dumps and I am waiting for the next
> > crash. But I understand a lot of people use FreeBSD with Dell servers, and
> > I would like to hear how to tackle this situation we are facing.

Sorry, I missed the original post on this.

We run a variety of Dell stuff where I work. Lots of 1850 and 2850 units,
and some 1950s and 2950s, in addition to a few 850s and the like.

We're not having any problems.
We routinely see uptimes that span from one maintenance window to the
next without any unplanned reboots.

One thing we've had fun with is that Dell has issued a LOT of firmware
upgrades over the last year, and those are a pain to get applied to
remote systems. However, I don't recall any stability problems prior to
the upgrades.

I know this isn't answering your question, but I thought I'd point out
that your experience is not typical. Somewhere, you are having a problem
that _can_ be solved. The various units you describe come in various
configurations; I wonder if you're picking a specific hardware combination
that FreeBSD has trouble with?

Otherwise, you're on the right path with the crash dumps. Once you have
more details, post them to this or the -hackers list to see if you can
get the problem narrowed down. However, if these systems are spontaneously
rebooting without a panic, crash dumps might not help.

You don't have IPMI enabled on a public interface, do you?

-- 
Bill Moran
http://www.potentialtech.com
