Date: Thu, 31 Mar 2005 00:32:35 -0500
From: Sven Willenberger <sven@dmv.com>
To: Vivek Khera <vivek@khera.org>
Cc: freebsd-amd64@freebsd.org
Subject: Re: Tyan k8sr lockups
Message-ID: <424B8B73.8040100@dmv.com>
In-Reply-To: <97964ce32490a368d64fa9b3500a8ba6@khera.org>
References: <f0111a98c01333b3c306c81d10294de4@khera.org> <Pine.SOC.4.61.0503300229560.3181@tea.blinkenlights.nl> <97964ce32490a368d64fa9b3500a8ba6@khera.org>

Vivek Khera presumably uttered the following on 03/30/05 20:22:
>
> On Mar 29, 2005, at 7:39 PM, Sten Spans wrote:
>
>> There was an amr panic related to the management ioctls
>> which was fixed and backported to RELENG_5. You should have
>> this fix. amr controllers are supported quite well on freebsd
>> thanks to Scott's great work.
>>
>
> Yes, that was part of the reason I cvsup'd again last week... to ensure
> I had the latest fixes to the amr driver.
>
>> The only way to get closer to solving these problems
>> is to dig and try to narrow it down:
>>
>> - Have you tried running with debugging ?
>
>
> any more than having the kernel debugger installed? When the box locked
> up I couldn't even drop into the kernel debugger from the serial
> console. neither the BREAK signal nor the alt key sequence invoked it.
>
>>
>> - Have you tried using other network cards ?
>>   ( yeah that sucks I know )
>
>
> Nope. Machine is brand spanking new. Was in service a whole of 5 days
> before it locked up. The twin of this machine also has issues with the
> BIOS reporting "memory size changed" while the machine is running... so
> I'm a bit concerned that there is some generic problem with the K8SR and
> a megaraid controller. But that one never had any complaints about the
> ethernet, and the memory size error persisted across two motherboards.
>
> I have yet to try the other ethernet port on this box as well.
>
>>
>> - Are you absolutely sure that all the disks are working ?
>>   ( there have been reports of amr cards acting strange with
>>   silently failing disks )
>
>
> The megaraid bios showed all disks as active. How would one tell if you
> had a silently failing disk? :-(
>
>> - Have you got the ufs fixes recently backported to releng_5 ?
>
>
> If it was prior to March 22, then yes I have them. Where in cvsweb
> might I look to test?
>
>> These are the first I can think of. RELENG_5 seems to be
>> a bit of a moving target with some quite critical fixes
>> going in ( which is good of course :).
>
>
> Yes, it is good.... until you can't figure out if it is your hardware or
> software flaking out on ya...
>
> Thanks so much for responding.
>
> For price-no-object, which vendor would you choose for an AMD system
> today? Same question if price is somewhat of a concern. Thanks.
>

For a piece of anecdotal evidence we are running a dual opteron k8s pro
with 8GB of RAM and the 320-2x Megaraid controller (which controls all
the harddrives). Except for a problem with fxp (which I haven't gone back
to reinvestigate but rather just used the Broadcom gigE instead) the
system so far has run fairly stable (running Postgres under a medium load
at the moment). We had one issue of a "spontaneous" reboot that left no
indication of what happened in messages; this box is still in our testing
area so it is possible that it was the result of a power[cord] issue.
This is running 5.4-PRERELEASE from 17 March. Also, we disabled the
onboard adaptec controllers in the bios (as we don't use them).

Sven