From owner-freebsd-stable Fri Apr 21 10:27:18 2000
Delivered-To: freebsd-stable@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3E84D37B618; Fri, 21 Apr 2000 10:27:15 -0700 (PDT)
	(envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e3LHtdZ15673;
	Fri, 21 Apr 2000 10:55:39 -0700 (PDT)
Date: Fri, 21 Apr 2000 10:55:39 -0700
From: Alfred Perlstein
To: Mike Smith
Cc: stable@FreeBSD.ORG
Subject: Re: amr still seems to have issues.
Message-ID: <20000421105539.C10782@fw.wintelcom.net>
References: <20000421003205.A25458@fw.wintelcom.net> <200004211644.JAA02335@mass.cdrom.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <200004211644.JAA02335@mass.cdrom.com>; from msmith@FreeBSD.ORG on Fri, Apr 21, 2000 at 09:44:35AM -0700
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Mike Smith [000421 10:07] wrote:
> > * Mike Smith [000420 11:39] wrote:
> > > > Hi, we're running 4.0-stable as of Sat Apr 15 18:39:08 PDT 2000
> > > > which includes the recent amr fixes which we were hoping would cure
> > > > the lockups with amr. Unfortunately we are now experiencing reboots;
> > > > the messages file reveals this:
> > > >
> > > > Apr 15 13:31:06 abacus /kernel: amr0: command 31 wedged after 30 seconds
> > >
> > > This is extra-bad. Without more feedback from the controller (no
> > > documentation from AMI yet, sorry. 8() I can only wonder whether you're
> > > getting a SCSI bus error of some sort that's causing the kernel to time
> > > these commands out (because the controller is taking too long to respond).
> > >
> > > You could try increasing the timeout allowance in amr_periodic(), or just
> > > disable the poll entirely. This won't help if the controller is really
> > > dropping commands, though.
> > >
> > > > Right now I'm attempting to log off a serial console to see what's
> > > > going on; however, this box has been in production (and doing miserably)
> > > > for some time now, so debugging is pretty difficult as well as
> > > > time consuming when I really need to be working on other issues.
> > >
> > > At this point, I have no other ideas, sorry.
> >
> > Here's something; I hope it helps:
>
> Hmm. Looks like I'm not retiring the wedged command correctly. This is
> a symptom, rather than the real problem, though. See if disabling the
> command timeout stuff makes the system happy - either these commands are
> just taking a _long_ time to complete, or you have another problem.

Yes, I've been thinking I may need a firmware upgrade from AMI.
I'll look into it.

> I _still_ think you have disk, cable or enclosure issues, but now that
> you've precipitated this case I can go look at what I'm doing wrong here.
> Thanks!

Yes, this is also a possibility. The really annoying thing is that I can
totally kill the controller and nothing happens, but if I leave it
alone for any amount of time it seems to blow up on its own.

Maybe I need to get on the phone with their support people and see if
they have any common troubleshooting tips...

-Alfred