Date:      Thu, 17 May 2001 10:18:18 +0200 (CEST)
From:      Metod Kozelj <metod.kozelj@rzs-hm.si>
To:        "Justin T. Gibbs" <gibbs@scsiguy.com>
Cc:        aic7xxx@FreeBSD.ORG
Subject:   Re: Weird problem
Message-ID:  <Pine.HPP.3.96.1010517094603.6988A-100000@hmljhp.rzs-hm.si>
In-Reply-To: <200105161445.f4GEjgU68882@aslan.scsiguy.com>

Hello,

On Wed, 16 May 2001, Justin T. Gibbs wrote:

> >A short question: with AIC7xxx v6.1.13 / linux v2.4.4, is it possible to
> >limit Tagged openings/Max command openings per device?
> 
> You can with a LILO option.  If you search the comments about LILO
> options in drivers/scsi/aic7xxx/aic7xxx_linux.c, you'll see how
> to do it.

OK, this works nicely (even with aboot :)

> Does the driver report any messages to the console?  If so, run with
> "aic7xxx=verbose" and send me the messages.  You may need to set up a
> serial console to accurately capture the messages.

The output was different on every try:

try 1 (without tag_info parameter):

   "Unable to handle kernel paging request".

   After that, the system freezes. Note: there is indeed a swap partition on
   the offending device, but it's not the only one and it has the lowest
   priority. The test was done on an idle machine, so only a little
   swapping was done, and that went to the other SCSI device. I guess the
   above message has something to do with a locked SCSI bus.

try 2 (the same settings as above)

   No console output at all

try 3 (with tag_info:{{253,16}})

   No extensive testing, but system seems to work fine

   Note: the problematic device is scsi0:0:1:0, hence the tag_info
   setting.
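
   For readers unfamiliar with the syntax: judging from the settings used
   in these tries, the values inside the inner braces are per-target queue
   depths in target-ID order for the first controller, so {{253,16}}
   leaves target 0 at 253 and limits target 1 to 16. Below is a minimal
   Python sketch that builds such a string. The helper name build_tag_info
   and the use of 253 as the filler value for targets that are not being
   limited are assumptions taken from the examples in this message, not
   from the driver source.

```python
DEFAULT_DEPTH = 253  # assumption: filler value taken from the examples in this message


def build_tag_info(target, depth, default=DEFAULT_DEPTH):
    """Return a tag_info string limiting one target on the first controller.

    Targets with a lower ID than `target` are filled with `default` so that
    the limited value lands in the right position. (Hypothetical helper.)
    """
    if target < 0 or depth < 1:
        raise ValueError("target must be >= 0 and depth must be >= 1")
    depths = [default] * target + [depth]
    return "tag_info:{{" + ",".join(str(d) for d in depths) + "}}"


if __name__ == "__main__":
    # Target 1 on controller 0 (scsi0:0:1:0), limited to 16 tagged commands.
    print("aic7xxx=" + build_tag_info(1, 16))
```

   The resulting string would then be passed on the kernel command line
   via the boot loader (LILO or aboot), as described in the comments in
   drivers/scsi/aic7xxx/aic7xxx_linux.c.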

try 4 (with tag_info:{{253,32}})

   System operates just fine

try 5 (with tag_info:{{253,62}})

   A register dump of some kind (part of it scrolls off the serial
   terminal); the bottom of it says:

   Trace: 8cccfc 8cc1d8 815f0c 819de4 816510 810ab8 810c64 xxxxxx 8269fc 810c64
   Kernel panic: Aiee, killing interrupt handler
   In interrupt handler - not syncing

   Note: I'm not a kernel hacker, so I don't really know what to do with
         this
   Note2: the xxxxxx value in the Trace is not really xxxxxx; it was a
         hex value that I wrote down incorrectly


try 6 (again with tag_info:{{253,62}})

   No output at all, just system freeze

try 7 (with tag_info:{{253,48}})

   There's a lot of activity on console after SCSI chokes. Some of it
   is like this:

.
.
.
DevQ(0:0:0): 0 waiting
DevQ(0:1:0): 0 waiting
DevQ(0:4:0): 0 waiting
Recovery SCB completes
(scsi0:A:1:0): Queuing a recovery SCB
scsi0:0:1:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:1:0: Attempting to queue ABORT message
scsi0: PCI error Interrupt at seqaddr = 0x7e
scsi0: Received a Target Abort
scsi0: Dumping Card state in Data-out phase, at SEQADDR 0x7e
SCSISEQ = 0x12, SBLKCTL = 0x2
 DFCNTRL = 0x3c, DFSTATUS = 0x71
LASTPHASE = 0x0, SCSISIGI = 0x14, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0x3
STACK == 0x0, 0x15d, 0x18d, 0x70
SCB count = 112
Kernel NEXTQSCB = 49
Card NEXTQSCB = 60
QINFIFO entries: 60 15
Waiting Queue entries:
Disconnected Queue entries: 0:70 12:53
QOUTFIFO entries:
Sequencer Free SCB List: 6 3 8 5 10 9 13 7 11 14 2 1 4
Pending list: 15 10 8 2 70 53 60 56
Kernel Free SCB list: ...........
.
.
.


    Again, I couldn't write down all of it ...
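
    If the serial console output is logged to a file instead of being
    copied by hand, the register values and SCB lists can be pulled out
    of the log mechanically. Below is a minimal Python sketch, assuming
    the dump lines look exactly like the excerpt above; the function names
    and the SAMPLE text are only illustrative and are not part of any
    driver tool.

```python
import re

# Matches "NAME = 0x12" pairs. "STACK == 0x0, ..." is deliberately not
# matched because of its double "=".
REGISTER_RE = re.compile(r"\b([A-Z][A-Z0-9]*)\s=\s0x([0-9a-fA-F]+)")

# A few lines copied from the dump excerpt in this message.
SAMPLE = """\
SCSISEQ = 0x12, SBLKCTL = 0x2
 DFCNTRL = 0x3c, DFSTATUS = 0x71
LASTPHASE = 0x0, SCSISIGI = 0x14, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0x3
STACK == 0x0, 0x15d, 0x18d, 0x70
Sequencer Free SCB List: 6 3 8 5 10 9 13 7 11 14 2 1 4
Pending list: 15 10 8 2 70 53 60 56
"""


def parse_registers(text):
    """Return {register name: integer value} for every 'NAME = 0x..' pair."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}


def parse_scb_list(text, label):
    """Return the integers following 'label:' on the first matching line."""
    for line in text.splitlines():
        if line.strip().startswith(label + ":"):
            tokens = line.split(":", 1)[1].split()
            return [int(tok) for tok in tokens if tok.isdigit()]
    return []


if __name__ == "__main__":
    registers = parse_registers(SAMPLE)
    pending = parse_scb_list(SAMPLE, "Pending list")
    print("DFSTATUS = 0x%x" % registers["DFSTATUS"])
    print("pending SCBs:", len(pending))
```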

I'm not sure the aic7xxx driver is really to blame for this device's
misbehaviour. It might just be the drive's firmware.

This hard disk has a long history here, but I guess it hasn't been under
such stress so far. First it was used in an AlphaStation 600 5/333, hooked
up to a "Q Logic ISP1020 (rev 02)", running DigitalUnix v3.2. Then it was
used in the current machine, but with an older Linux kernel and the
previous-generation aic7xxx driver, with max tagged commands set to 16.

So, the problem seems to be cured by setting the maximum command openings
to something as low as 32 (I'll stress the system with this setting to
see whether it really helps).

If there are some other tests to be done, I'd be glad to help.

Regards,
  Metod

Metod Kozelj

mailto:Metod.Kozelj@rzs-hm.si            /\  Ne posiljajte mi smeti ker grizem!
http://www.rzs-hm.si/                   /  \  Don't spam me for I bite!
_______________________________________/    \__________________________________

---- perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'






