From owner-freebsd-stable Sun Apr 22 7:35: 5 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from bjorn.goddamnbastard.org (c1283020-a.hrvy1.il.home.com [24.183.37.152])
	by hub.freebsd.org (Postfix) with SMTP id F101437B42C
	for ; Sun, 22 Apr 2001 07:34:49 -0700 (PDT)
	(envelope-from ryanb@bjorn.goddamnbastard.org)
Received: (qmail 18114 invoked by uid 1000); 22 Apr 2001 14:34:49 -0000
Date: Sun, 22 Apr 2001 09:34:49 -0500
From: ryan beasley
To: Mike Smith
Cc: freebsd-stable@freebsd.org
Subject: Re: AMI MegaRAID (428 series; Enterprise 1200?) + 4-STABLE (2001.12.14) -> hard lock w/o ability to dump
Message-ID: <20010422093448.A12688@bjorn.goddamnbastard.org>
References: <20010420120801.B9227@bjorn.goddamnbastard.org> <200104202056.f3KKuKf02398@mass.dis.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <200104202056.f3KKuKf02398@mass.dis.org>; from msmith@freebsd.org on Fri, Apr 20, 2001 at 01:56:20PM -0700
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Fri, Apr 20, 2001 at 01:56:20PM -0700, Mike Smith wrote:
>
> It does sound like you have a filesystem lock cascade going on, which
> would be explained by a lost I/O.  There are other possibilities though.
>

At this point, I'm open to the idea of divine intervention.  (Well, not
really, but .... ;)

>
> No, you're on the wrong track here; the driver doesn't have a dump
> routine, so you can't dump to it under any circumstances.
>

Thanks much for the correction.  That'll definitely school me for not
looking at a shred of sources before posting.  I was never aware that
such drivers (without dump routines) existed.

> You need to update your firmware first; UF82-166 is what you want.  It's
> possible that this isn't the problem, but I can't offer you any help until
> you do upgrade, as I don't have a defect listing for AMI's firmware.

I upgraded from uc77 to uf82 Friday evening ~6pm CST.
I talked to AMI support shortly afterwards and found out that uc77 wasn't
one of their specific firmware revs; that release must've been a Dell
release.  (AMI support guy said that AMI firmware revs, at least for this
card, only start as "us" or "uf", not "uc".)

... anywho, it's now at

amr0: port 0xe480-0xe4ff irq 18 at device 10.0 on pci2
amr0: Firmware UF82, BIOS 1.66, 128MB RAM

I woke up to a page ~5am CST today to find it in the same locked state.
I called a panic and the machine was back in a few minutes.

As it stands now, I'm looking at the possibilities of the RAID controller
itself or the RAM installed on it; I'm hoping the driver's OK.  (There
would probably be a lot more posts about this if said driver wasn't.)

It's just odd, coming from the
I-don't-really-know-that-much-about-controllers-and-their-related-drivers
community, that no errors are sent to the console of a SCSI timeout
considering that the kernel itself is still flying high.  (I must
re-emphasize my lack of understanding for the kernel and I/O; too many
fires to fight -> not much time for research.)

Current plan:
I'm going to take a look inside the machine Monday and look at the
possibility of pulling out the CD-ROM drive and replacing it with a
decent 4G disk attached to an on-board Adaptec controller.  I'm hoping
that this will give me the chance to generate some useful crash dumps.

/me thanks both Mike and this list for collective time stolen for
reading/thinking/responding purposes.

- ryan

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message