Date:      Thu, 31 Mar 2005 00:32:35 -0500
From:      Sven Willenberger <sven@dmv.com>
To:        Vivek Khera <vivek@khera.org>
Cc:        freebsd-amd64@freebsd.org
Subject:   Re: Tyan k8sr lockups
Message-ID:  <424B8B73.8040100@dmv.com>
In-Reply-To: <97964ce32490a368d64fa9b3500a8ba6@khera.org>
References:  <f0111a98c01333b3c306c81d10294de4@khera.org> <Pine.SOC.4.61.0503300229560.3181@tea.blinkenlights.nl> <97964ce32490a368d64fa9b3500a8ba6@khera.org>



Vivek Khera presumably uttered the following on 03/30/05 20:22:
> 
> On Mar 29, 2005, at 7:39 PM, Sten Spans wrote:
> 
>> There was an amr panic related to the management ioctls
>> which was fixed and backported to RELENG_5. You should have
>> this fix. amr controllers are supported quite well on FreeBSD
>> thanks to Scott's great work.
>>
> 
> Yes, that was part of the reason I cvsup'd again last week... to ensure 
> I had the latest fixes to the amr driver.
> 
>> The only way to get closer to solving these problems
>> is to dig and try to narrow it down:
>>
>> - Have you tried running with debugging ?
> 
> 
> Any more than having the kernel debugger installed?  When the box locked 
> up I couldn't even drop into the kernel debugger from the serial 
> console. Neither the BREAK signal nor the alt key sequence invoked it.
> 
>>
>> - Have you tried using other network cards ?
>>   ( yeah that sucks I know )
> 
> 
> Nope.  Machine is brand spanking new.  Was in service a whole 5 days 
> before it locked up.  The twin of this machine also has issues with the 
> BIOS reporting "memory size changed" while the machine is running... so 
> I'm a bit concerned that there is some generic problem with the K8SR and 
> a megaraid controller.  But that one never had any complaints about the 
> ethernet, and the memory size error persisted across two motherboards.
> 
> I have yet to try the other ethernet port on this box as well.
> 
>>
>> - Are you absolutely sure that all the disks are working ?
>>   ( there have been reports of amr cards acting strange with
>>     silently failing disks )
> 
> 
> The megaraid bios showed all disks as active.  How would one tell if you 
> had a silently failing disk? :-(
> 
>> - Have you got the ufs fixes recently backported to releng_5 ?
> 
> 
> If it was prior to March 22, then yes I have them.  Where in cvsweb 
> might I look to test?
> 
>> These are the first I can think of. RELENG_5 seems to be
>> a bit of a moving target with some quite critical fixes
>> going in ( which is good of course :).
> 
> 
> Yes, it is good.... until you can't figure out if it is your hardware or 
> software flaking out on ya...
> 
> Thanks so much for responding.
> 
> For price-no-object, which vendor would you choose for an AMD system 
> today?  Same question if price is somewhat of a concern.  Thanks.
> 

As a piece of anecdotal evidence: we are running a dual Opteron K8S Pro 
with 8GB of RAM and the 320-2X MegaRAID controller (which controls all 
the hard drives). Except for a problem with fxp (which I haven't gone 
back to reinvestigate; I just used the Broadcom gigE instead), the 
system has so far been fairly stable (running Postgres under a medium 
load at the moment). We had one "spontaneous" reboot that left no 
indication in messages of what happened; this box is still in our 
testing area, so it is possible that it was the result of a power[cord] 
issue. This is running 5.4-PRERELEASE from 17 March. Also, we disabled 
the onboard Adaptec controllers in the BIOS (as we don't use them).

Sven


