Date: Sun, 22 Apr 2001 09:34:49 -0500
From: ryan beasley <ryanb@goddamnbastard.org>
To: Mike Smith <msmith@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: AMI MegaRAID (428 series; Enterprise 1200?) + 4-STABLE (2001.12.14) -> hard lock w/o ability to dump
Message-ID: <20010422093448.A12688@bjorn.goddamnbastard.org>
In-Reply-To: <200104202056.f3KKuKf02398@mass.dis.org>; from msmith@freebsd.org on Fri, Apr 20, 2001 at 01:56:20PM -0700
References: <20010420120801.B9227@bjorn.goddamnbastard.org> <200104202056.f3KKuKf02398@mass.dis.org>
On Fri, Apr 20, 2001 at 01:56:20PM -0700, Mike Smith wrote:
>
> It does sound like you have a filesystem lock cascade going on, which
> would be explained by a lost I/O. There are other possibilities though.
>

At this point, I'm open to the idea of divine intervention. (Well, not
really, but .... ;)

>
> No, you're on the wrong track here; the driver doesn't have a dump
> routine, so you can't dump to it under any circumstances.
>

Thanks much for the correction. That'll definitely school me for not
looking at a shred of sources before posting. I was never aware that such
drivers (without dump routines) existed.

> You need to update your firmware first; UF82-166 is what you want. It's
> possible that this isn't the problem, but I can't offer you any help until
> you do upgrade, as I don't have a defect listing for AMI's firmware.

I upgraded from uc77 to uf82 Friday evening ~6pm CST. I talked to AMI
support shortly afterwards and found out that uc77 wasn't one of their
specific firmware revs; that release must've been a Dell release. (AMI
support guy said that AMI firmware revs, at least for this card, only
start as "us" or "uf", not "uc".) ... anywho, it's now at

amr0: <AMI MegaRAID> port 0xe480-0xe4ff irq 18 at device 10.0 on pci2
amr0: <Series 428> Firmware UF82, BIOS 1.66, 128MB RAM

I woke up to a page ~5am CST today to find it in the same locked state. I
called a panic and the machine was back in a few minutes.

As it stands now, I'm looking at the possibilities of the RAID controller
itself or the RAM installed on it; I'm hoping the driver's OK. (There would
probably be a lot more posts about this if said driver wasn't.) It's just
odd, coming from the
I-don't-really-know-that-much-about-controllers-and-their-related-drivers
community, that no errors are sent to the console of a SCSI timeout
considering that the kernel itself is still flying high. (I must
re-emphasize my lack of understanding for the kernel and I/O; too many
fires to fight -> not much time for research.)

Current plan: I'm going to take a look inside the machine Monday and look
at the possibility of pulling out the CD-ROM drive and replacing it with a
decent 4G disk attached to an on-board Adaptec controller. I'm hoping that
this will give me the chance to generate some useful crash dumps. <grin>

/me thanks both Mike and this list for collective time stolen for
reading/thinking/responding purposes.

- ryan

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message