Date:      Wed, 30 Mar 2005 02:39:24 +0200 (CEST)
From:      Sten Spans <sten@blinkenlights.nl>
To:        Vivek Khera <vivek@khera.org>
Cc:        freebsd-amd64@freebsd.org
Subject:   Re: Tyan k8sr lockups
Message-ID:  <Pine.SOC.4.61.0503300229560.3181@tea.blinkenlights.nl>
In-Reply-To: <f0111a98c01333b3c306c81d10294de4@khera.org>
References:  <f0111a98c01333b3c306c81d10294de4@khera.org>

On Tue, 29 Mar 2005, Vivek Khera wrote:

> I have a brand new Tyan k8sr based system with a megaraid 320-2X controller 
> in it.
>
> This is my second copy of this box (the first one is being replaced since it 
> keeps locking up and reporting "memory size changed" in the BIOS, even with a 
> new motherboard)...
>
> Anyhow, this particular machine has just simply locked up with no errors 
> reported to console or syslog or BIOS.  It does have a newer BIOS rev, so 
> perhaps that is why nothing is logged to BIOS...  This happens during times 
> of heavy loads (large database reports, database dump, database replication 
> all simultaneously running).  If the machine doesn't crash, I'll see a "bge0 
> timeout -- resetting" in the syslog most days during the time of heavy load 
> (reports run on a remote client).
>
> I see at http://unix.derkeiler.com/Mailing-Lists/FreeBSD/hackers/2005-03/ 
> 0419.html that this is happening to at least one other person on this 
> motherboard, but using the on-board controller.
>
> Someone else just noted similar lockups with 5.3 with SMP running mysql, and 
> about 2 weeks ago there was a discussion about amr driver panic under 
> 5.4-PRERELEASE, both on the stable@ list.  However these were not amd64 
> systems, if that matters.

There was an amr panic related to the management ioctls,
which was fixed and backported to RELENG_5. You should have
this fix. amr controllers are supported quite well on FreeBSD
thanks to Scott's great work.

The only way to get closer to solving these problems
is to dig and try to narrow it down:

- Have you tried running with debugging?
- Have you tried using other network cards?
   (yeah, that sucks, I know)
- Are you absolutely sure that all the disks are working?
   (there have been reports of amr cards acting strange with
     silently failing disks)
- Have you got the ufs fixes recently backported to RELENG_5?

These are the first things I can think of. RELENG_5 seems to be
a bit of a moving target, with some quite critical fixes
going in (which is good, of course :).
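
As a starting point for narrowing it down, it may help to see when the
bge0 timeouts cluster relative to the heavy-load jobs. The following is
only a rough sketch in Python, not a tested tool: the function names are
made up for illustration, the timestamp pattern assumes the classic
"Mon DD HH:MM:SS" syslog prefix, and the match strings are taken from the
message quoted above, so adjust them to whatever your log really contains.

```python
import re
from collections import Counter

# Assumption: lines begin with a classic syslog stamp such as
# "Mar 29 14:02:11 host ...". Adjust the pattern if your format differs.
STAMP = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):\d{2}:\d{2}\s")


def timeouts_per_hour(lines, iface="bge0", needle="timeout -- resetting"):
    """Count log lines mentioning iface and needle per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        # Match the two fragments separately so extra words between them
        # (for example "watchdog") do not prevent a hit.
        if iface not in line or needle not in line:
            continue
        m = STAMP.match(line)
        if m:
            counts[(m.group(1), int(m.group(2)), int(m.group(3)))] += 1
    return counts


def report(path):
    """Print one line per hour bucket, in the order buckets first appear."""
    with open(path, errors="replace") as fh:
        for (month, day, hour), n in timeouts_per_hour(fh).items():
            print(f"{month} {day:2d} {hour:02d}:00  {n}")
```

Calling report() on the messages file (commonly /var/log/messages, but
that path is an assumption about your syslog.conf) and comparing the busy
hours against the report and dump schedule should show whether the
timeouts track the load or turn up at other times as well.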

HTH, HAND
-- 
Sten Spans

"There is a crack in everything, that's how the light gets in."
Leonard Cohen - Anthem


