Date: Wed, 30 Mar 2005 02:39:24 +0200 (CEST)
From: Sten Spans <sten@blinkenlights.nl>
To: Vivek Khera <vivek@khera.org>
Cc: freebsd-amd64@freebsd.org
Subject: Re: Tyan k8sr lockups
Message-ID: <Pine.SOC.4.61.0503300229560.3181@tea.blinkenlights.nl>
In-Reply-To: <f0111a98c01333b3c306c81d10294de4@khera.org>
References: <f0111a98c01333b3c306c81d10294de4@khera.org>

On Tue, 29 Mar 2005, Vivek Khera wrote:

> I have a brand new Tyan k8sr based system with a megaraid 320-2X controller
> in it.
>
> This is my second copy of this box (the first one is being replaced since it
> keeps locking up and reporting "memory size changed" in the BIOS, even with a
> new motherboard)...
>
> Anyhow, this particular machine has just simply locked up with no errors
> reported to console or syslog or BIOS. It does have a newer BIOS rev, so
> perhaps that is why nothing is logged to BIOS... This happens during times
> of heavy loads (large database reports, database dump, database replication
> all simultaneously running). If the machine doesn't crash, I'll see a "bge0
> timeout -- resetting" in the syslog most days during the time of heavy load
> (reports run on a remote client).
>
> I see at
> http://unix.derkeiler.com/Mailing-Lists/FreeBSD/hackers/2005-03/0419.html
> that this is happening to at least one other person on this
> motherboard, but using the on-board controller.
>
> Someone else just noted similar lockups with 5.3 with SMP running mysql, and
> about 2 weeks ago there was a discussion about amr driver panic under
> 5.4-PRERELEASE, both on the stable@ list. However these were not amd64
> systems, if that matters.

There was an amr panic related to the management ioctls which was fixed
and backported to RELENG_5. You should have this fix. amr controllers
are supported quite well on FreeBSD thanks to Scott's great work.

The only way to get closer to solving these problems is to dig
and try to narrow it down:

- Have you tried running with debugging?
- Have you tried using other network cards?
  (yeah, that sucks, I know)
- Are you absolutely sure that all the disks are working?
  (there have been reports of amr cards acting strange
  with silently failing disks)
- Have you got the ufs fixes recently backported to RELENG_5?

These are the first I can think of.
RELENG_5 seems to be a bit of a moving target, with some quite
critical fixes going in (which is good, of course :).

HTH, HAND

--
Sten Spans

"There is a crack in everything, that's how the light gets in."
Leonard Cohen - Anthem