Date: Thu, 17 May 2001 10:18:18 +0200 (CEST)
From: Metod Kozelj
Subject: Re: Weird problem
In-reply-to: <200105161445.f4GEjgU68882@aslan.scsiguy.com>
To: "Justin T. Gibbs"
Cc: aic7xxx@FreeBSD.ORG

Hello,

On Wed, 16 May 2001, Justin T. Gibbs wrote:

> >A short question: with AIC7xxx v6.1.13 / linux v2.4.4, is it possible to
> >limit Tagged opennings/Max command opennings per device?
>
> You can with a LILO option. If you search the comments about LILO
> options in drivers/scsi/aic7xxx/aic7xxx_linux.c, you'll see how
> to do it.

OK, this works nicely (even with aboot :)

> Does the driver report any messages to the console? If so, run with
> "aic7xxx=verbose" and send me the messages. You may need to setup a
> serial console to acurately capture the messages.

The output was different on every try:

try 1 (without tag_info parameter):
  "Unable to handle kernel paging request". After that, the system freezes.
  Note: there is indeed a swap partition on the offending device, but it is
  not the only one and it has the lowest priority. The test was done on an
  idle machine, so only a little swapping was done, and that went to the
  other SCSI device. I guess the above message has something to do with a
  locked SCSI bus.

try 2 (the same settings as above)
  No console output at all.

try 3 (with tag_info:{{253,16}})
  No extensive testing, but the system seems to work fine.
  Note: the problematic device is scsi0:0:1:0, hence the tag_info setting.

try 4 (with tag_info:{{253,32}})
  The system operates just fine.

try 5 (with tag_info:{{253,62}})
  A register dump (some of it scrolls off the serial terminal); the bottom
  of it says:

    Trace: 8cccfc 8cc1d8 815f0c 819de4 816510 810ab8 810c64 xxxxxx 8269fc 810c64
    Kernel panic: Aiee, killing interrupt handler
    In interrupt handler - not syncing

  Note: I'm not a kernel hacker, so I don't really know what to do with this.
  Note2: the xxxxxx value in the Trace is not really xxxxxx; it was a hex
  value that I wrote down wrongly.

try 6 (again with tag_info:{{253,62}})
  No output at all, just a system freeze.

try 7 (with tag_info:{{253,48}})
  There's a lot of activity on the console after SCSI chokes. Some of it
  looks like this:

    .
    .
    .
    DevQ(0:0:0): 0 waiting
    DevQ(0:1:0): 0 waiting
    DevQ(0:4:0): 0 waiting
    Recovery SCB completes
    (scsi0:A:1:0): Queuing a recovery SCB
    scsi0:0:1:0: Device is disconnected, re-queuing SCB
    Recovery code sleeping
    Recovery code awake
    aic7xxx_abort returns 8194
    scsi0:0:1:0: Attempting to queue ABORT message
    scsi0: PCI error Interrupt at seqaddr = 0x7e
    scsi0: Received a Target Abort
    scsi0: Dumping Card state in Data-out phase, at SEQADDR 0x7e
    SCSISEQ = 0x12, SBLKCTL = 0x2
    DFCNTRL = 0x3c, DFSTATUS = 0x71
    LASTPHASE = 0x0, SCSISIGI = 0x14, SXFRCTL0 = 0x80
    SSTAT0 = 0x0, SSTAT1 = 0x3
    STACK == 0x0, 0x15d, 0x18d, 0x70
    SCB count = 112
    Kernel NEXTQSCB = 49
    Card NEXTQSCB = 60
    QINFIFO entries: 60 15
    Waiting Queue entries:
    Disconnected Queue entries: 0:70 12:53
    QOUTFIFO entries:
    Sequencer Free SCB List: 6 3 8 5 10 9 13 7 11 14 2 1 4
    Pending list: 15 10 8 2 70 53 60 56
    Kernel Free SCB list: ...........
    .
    .
    .

  Again, I couldn't write down all of it ...

I'm not sure that it is really the aic7xxx driver's fault that this device
misbehaves. It just might be the firmware.
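
The tag_info values used in the tries above all follow one pattern, which can be sketched in a few lines of Python. This is only an illustration, not part of the driver: the helper name tag_info_for_target is hypothetical, and the sketch assumes (as the settings above suggest) that the values inside the inner braces are listed in target-ID order for the first controller, with 253 used for the targets that are left alone.

```python
def tag_info_for_target(target_id, depth, default_depth=253):
    """Build a tag_info fragment limiting one target on the first controller.

    Assumption: values are listed in target-ID order, so targets below
    target_id are filled with default_depth (253, as in the tries above).
    """
    depths = [default_depth] * target_id + [depth]
    return "tag_info:{{" + ",".join(str(d) for d in depths) + "}}"


# The problematic device is target 1 (scsi0:0:1:0), limited to 32 tags.
print(tag_info_for_target(1, 32))  # prints tag_info:{{253,32}}
```

The resulting fragment is what gets passed to the driver as a boot option; where exactly it goes depends on the boot loader in use.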
This hard disk has a long history here, but I guess it hasn't been under
this much stress before. First it was used in an AlphaStation 600 5/333,
hooked up to a "Q Logic ISP1020 (rev 02)", running DigitalUnix v3.2. Then it
was used in the current machine, but with an older Linux kernel and the
previous-generation aic7xxx driver, with the maximum number of tagged
commands set to 16.

So, the problem seems to be cured by setting the maximum command openings to
something as low as 32 (I'll stress the system with this setting to see
whether it really helps). If there are other tests to be done, I'd be glad
to help.

Regards,

    Metod

Metod Kozelj
mailto:Metod.Kozelj@rzs-hm.si            /\   Ne posiljajte mi smeti ker grizem!
http://www.rzs-hm.si/                   /  \  Don't spam me for I bite!
_______________________________________/    \__________________________________
----
perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'