Date: Thu, 11 Aug 2011 02:28:58 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: Attilio Rao <attilio@freebsd.org>, freebsd-stable@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
Message-ID: <20110811092858.GA94514@icarus.home.lan>
In-Reply-To: <44DD20E1CFA949E8A1B15B3847769DCB@multiplay.co.uk>
References: <47F0D04ADF034695BC8B0AC166553371@multiplay.co.uk>
 <A71C3ACF01EC4D36871E49805C1A5321@multiplay.co.uk>
 <4E4380C0.7070908@FreeBSD.org>
 <CAJ-FndAq2ASHzg_%2B9S__x=vTAgzHowMrv1DFSbXwroX27PF36A@mail.gmail.com>
 <44DD20E1CFA949E8A1B15B3847769DCB@multiplay.co.uk>
On Thu, Aug 11, 2011 at 09:59:36AM +0100, Steven Hartland wrote:
> That's not the issue as its happening across board over 130 machines :(

Agreed, bad hardware sounds unlikely here. I could believe some strange
incompatibility (e.g. BIOS quirk or the like[1]) that might cause
problems en masse across many servers, but hardware issues are unlikely
in this situation.

[1]: I mention this because we had something similar happen at my
workplace. For months we used a specific model of system from our
vendor which worked reliably, zero issues. Then we got a new shipment
of boxes (same model as prior) which started acting very odd (often
AHCI timeout issues or MCEs which when decoded would usually turn out
to be nonsensical). It took weeks to determine the cause given how
slow the vendor was to respond: root cause turned out to be that the
vendor decided, on a whim, to start shipping a newer BIOS version which
wasn't "as compatible" with Solaris as previous BIOSes. Downgrading
all the systems to the older BIOS fixed the problem.

In Steve's case this is unlikely to be the situation, but I thought I'd
share the story anyway. "SKU ABCXYZ-1" from August 2009 is not
necessarily the same thing as "SKU ABCXYZ-1" from May 2010. ;-)

This is also why I prefer to buy/build my own systems, since I cannot
trust vendors to not mess about with settings w/out changing SKUs, P/Ns,
or revision numbers.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |