Date:      Sat, 8 Aug 98 12:11 PDT
From:      wojo@veritas.com (Jack Woychowski)
To:        AIC7xxx@FreeBSD.ORG
Cc:        wojo@veritas.com
Subject:   Timeouts and Resets with 5.X.X drivers - what to do? (newbie - sorry)
Message-ID:  <m0z5EOo-00007LC@megami.veritas.com>


Ok, I've been hanging back for a few weeks reading over the mail on
this list and searching around for answers, but haven't really seen
any answers that seem to address this problem. My apologies as a
newbie for probably asking something that's already being addressed; I
think I joined the mailing list just after these started. (Flames for
being lame can be addressed directly to me. :-)

Details of system: HP Vectra XA w/64M memory, running linux 2.0.35
(lately) with the 5.1.0pre6 aic7xxx patch. I've got two Adaptec AHA-294X
Ultra SCSI controllers, one wide, the other narrow. (My wide one is new and
currently unused; it's part of a Firewire card. Gory details of
attached disks, etc. from /proc attached below.) Note that I also
alternate-boot this system with NT 4.0 SP3 (yeah, I know, but it's for
work :-); NT experiences no similar problems.

I've been experiencing the same problems in all linux versions since
2.0.32 (and as a result have spent most of the time running 2.0.29). I
believe this makes it a problem with the 5.X.X aic7xxx driver vs the
4.1.1 (from straight linux 2.0.29), which doesn't seem to have the same
problem.

Along with others, I've been experiencing the 'aborting due to
timeout', 'resetting bus', and 'trying harder' problems. For example,
from dmesg (this occurred during a kernel build):

scsi : aborting command due to timeout : pid 11051, scsi0, channel 0, id 3, lun 0 0x0a 16 55 3b 02 00
SCSI host 0 channel 0 reset (pid 11049) timed out - trying harder
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:2:0) Synchronous at 20.0 Mbyte/sec, offset 15.
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
scsi : aborting command due to timeout : pid 11051, scsi0, channel 0, id 3, lun 0 0x0a 16 55 3b 02 00
SCSI host 0 abort (pid 11049) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:2:0) Synchronous at 20.0 Mbyte/sec, offset 15.
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
SCSI host 0 abort (pid 11051) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
SCSI host 0 abort (pid 11049) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:2:0) Synchronous at 20.0 Mbyte/sec, offset 15.
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
SCSI host 0 channel 0 reset (pid 11051) timed out - trying harder
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:2:0) Synchronous at 20.0 Mbyte/sec, offset 15.
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
[..]
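
For anyone comparing notes, a rough way to see which targets are involved is
to tally the abort and reset messages in a saved dmesg dump. The following is
only a sketch in Python: the regular expressions are guessed from the excerpt
above (other kernel or driver versions may word the messages differently), and
the function name tally() is made up for illustration, not part of any tool.

```python
import re
from collections import Counter

# Patterns guessed from the excerpt above; they are assumptions, not a
# documented message format.
ABORT_RE = re.compile(
    r"aborting command due to timeout : pid (\d+), scsi(\d+), "
    r"channel (\d+), id (\d+), lun\s*(\d+)"
)
RESET_RE = re.compile(r"SCSI bus is being reset for host (\d+) channel (\d+)")


def tally(log_text):
    """Count timeout aborts per (host, channel, id, lun) and bus resets
    per (host, channel)."""
    aborts = Counter()
    resets = Counter()
    # Rejoin any abort line that was wrapped right after "lun".
    text = re.sub(r"lun\s*\n\s*", "lun ", log_text)
    for line in text.splitlines():
        m = ABORT_RE.search(line)
        if m:
            _pid, host, chan, target, lun = map(int, m.groups())
            aborts[(host, chan, target, lun)] += 1
            continue
        m = RESET_RE.search(line)
        if m:
            resets[tuple(map(int, m.groups()))] += 1
    return aborts, resets
```

Run over the excerpt above, this counts two timeout aborts, both for id 3, and
five bus resets on host 0 channel 0.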

This seems to happen only under higher I/O loads - during fsck cycles
during boot, or during kernel builds and the like. It seems to get
cumulatively worse; for example, it "backs up" on kernel builds: the
build runs along fine for a while, then there's one reset, ten seconds
later another, two seconds later another, then they pile up on one
another and the system is effectively totally bogged down. This
problem originally showed up on only one disk which was my Linux root
(target 0 - a really old, slow disk). More recently, I moved the root
to a more modern drive (target 3) and now the problem shows up on
targets 2 and 3, so I guess it's not just my slow (<4M/sec - bleck)
drive. (Target 0 is now totally unused, but still attached.)

So, just from being on the list a few weeks, I've seen others with
this problem, or at least the same output; I just haven't seen a
diagnosis, or at least not one that seems to apply to me. I don't
believe I've got conflicting addresses or the like, or conflicting
interrupts, or such. None of the "try this" advice that has passed by
me via email has seemed applicable.

I guess the newbie silly-rabbit-trix-are-for-kids question is: are
things like tagged queueing and greater queue depths s'posed to help
this problem? I've been avoiding them (not knowing exactly what they
are or what they might do for me - yikes) but if they help, I'll try
'em. (BTW, feel free to (indignantly :-) point me to something to read
to learn, rather than waste your time explaining them to me - I'm
certainly willing to do that on my own, I just haven't found a source
of info as of yet.)

My problem is that I need to be running 2.0.34 or higher to start
playing around with a firewire driver, but the timeout/abort/reset
problems pretty much stop me (can't build the darn kernel). I guess
one solution would be to use the 4.1.1 driver, but that doesn't seem
to be supported for 2.0.30+ kernels.

So: What do I do now? Any further debuggering I could be doing? Should
I start reading source? :-) (actually, I have, and don't understand a
whole lot of it - gotta find a design paper somewhere. :-)

							-- Woof

******************************************************************************
* Jack Woychowski              \\\     _     //                 Kernel Hound *
* VERITAS Software              \\\   //\   //              wojo@veritas.com *
* 1600 Plymouth Street           \\\ // \\ // oof!!     VOICE: (650)335-8533 *
* Mountain View, California 94043 \\\/   \\/              FAX: (650)335-8050 *
* -------------------------------------------------------------------------- *
* Join me in the League For Programming Freedom.     Questions? Just ask me. *
******************************************************************************
* Yow!-Zippy-Says: I'm a GENIUS!  I want to dispute sentence                 *
* structure with SUSAN SONTAG!!                                              *
******************************************************************************

GORY SYSTEM DETAILS
-------------------

wahya:/lhome/wojo$ cat /proc/pci
PCI devices found:
  Bus  0, device  12, function  0:
    SCSI storage controller: Adaptec AIC-7881U (rev 0).
      Medium devsel.  Fast back-to-back capable.  IRQ 10.  Master
      Capable.  Latency=64.  Min Gnt=8.Max Lat=8.
      I/O at 0xfc00.
      Non-prefetchable 32 bit memory at 0xfedfb000.
  Bus  1, device   5, function  0:
    FireWire (IEEE 1394): Adaptec AIC-5800 (rev 16).
      Medium devsel.  IRQ 9.  Master Capable.  Latency=64.  
      Non-prefetchable 32 bit memory at 0xfecfec00.
  Bus  1, device   4, function  0:
    SCSI storage controller: Adaptec AIC-7881U (rev 1).
      Medium devsel.  Fast back-to-back capable.  IRQ 11.  Master
      Capable.  Latency=64.  Min Gnt=8.Max Lat=8.
      I/O at 0xec00.
      Non-prefetchable 32 bit memory at 0xfecff000.
  Bus  0, device  10, function  0:
    PCI bridge: DEC DC21152 (rev 2).
      Medium devsel.  Fast back-to-back capable.  Master Capable.
      Latency=64.  Min Gnt=4.Max Lat=2.
  Bus  0, device   6, function  0:
    VGA compatible controller: Matrox Millennium (rev 1).
      Medium devsel.  Fast back-to-back capable.  IRQ 9.  
      Non-prefetchable 32 bit memory at 0xfedfc000.
      Prefetchable 32 bit memory at 0xfe000000.
  Bus  0, device   4, function  1:
    IDE interface: Intel 82371SB PIIX3 IDE (rev 0).
      Medium devsel.  Fast back-to-back capable.  Master Capable.
      Latency=32.  
      I/O at 0x580.
  Bus  0, device   4, function  0:
    ISA bridge: Intel 82371SB PIIX3 ISA (rev 1).
      Medium devsel.  Fast back-to-back capable.  Master Capable.  No
      bursts.  
  Bus  0, device   0, function  0:
    Host bridge: Intel 82441FX Natoma (rev 2).
      Medium devsel.  Fast back-to-back capable.  Master Capable.
      Latency=32.  

----------------------------------------------------------------------

wahya:/lhome/wojo$ cat /proc/scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 5.1.0pre6/3.2.4
Compile Options:
  AIC7XXX_RESET_DELAY    : 5
  AIC7XXX_TAGGED_QUEUEING: Adapter Support Enabled
                             Check below to see which
                             devices use tagged queueing
  AIC7XXX_PAGE_ENABLE    : Enabled (This is no longer an option)
  AIC7XXX_PROC_STATS     : Enabled

Adapter Configuration:
           SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter
                           Ultra Narrow Controller
    PCI MMAPed I/O Base: 0xfedfb000
      Adaptec SCSI BIOS: Enabled
                    IRQ: 10
                   SCBs: Active 0, Max Active 2,
                         Allocated 15, HW 16, Page 255
             Interrupts: 16598
      BIOS Control Word: 0x19b6
   Adapter Control Word: 0x001b
   Extended Translation: Enabled
Disconnect Enable Flags: 0x00ff
     Ultra Enable Flags: 0x00e7
 Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 8
    Tagged Queue By Device array for aic7xxx host instance 0:
      {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
    Actual queue depth per device for aic7xxx host instance 0:
      {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:
(scsi0:0:0:0)
  Device using Narrow/Sync transfers at
  20.0 MByte/sec, offset 15
    Total transfers 3 (3 read;0 written)
      blks(512) rd=5; blks(512) wr=0
        < 512 512-1K   1-2K   2-4K   4-8K  8-16K 16-32K 32-64K 64-128K >128K
 Reads:     0      1      2      0      0      0      0      0      0      0 
Writes:     0      0      0      0      0      0      0      0      0      0 

(scsi0:0:1:0)
  Device using Narrow/Async transfers.
    Total transfers 2 (2 read;0 written)
      blks(512) rd=3; blks(512) wr=0
        < 512 512-1K   1-2K   2-4K   4-8K  8-16K 16-32K 32-64K 64-128K >128K
 Reads:     0      1      1      0      0      0      0      0      0      0 
Writes:     0      0      0      0      0      0      0      0      0      0 

(scsi0:0:2:0)
  Device using Narrow/Sync transfers at
  20.0 MByte/sec, offset 15
    Total transfers 7637 (6523 read;1114 written)
      blks(512) rd=68195; blks(512) wr=5864
        < 512 512-1K   1-2K   2-4K   4-8K  8-16K 16-32K 32-64K 64-128K >128K
 Reads:     0      1   1748    526   1828   2243    146     28      3      0 
Writes:     0      0    848    143     65     31     14      8      5      0 

(scsi0:0:3:0)
  Device using Narrow/Sync transfers at
  10.0 MByte/sec, offset 15
    Total transfers 8329 (2558 read;5771 written)
      blks(512) rd=21299; blks(512) wr=14936
        < 512 512-1K   1-2K   2-4K   4-8K  8-16K 16-32K 32-64K 64-128K >128K
 Reads:     0      1   1227     46    636    629      7      7      5      0 
Writes:     0      0   4668   1068     30      1      0      0      4      0 

(scsi0:0:4:0)
  Device using Narrow/Sync transfers at
  10.0 MByte/sec, offset 15
    Total transfers 2 (2 read;0 written)
      blks(512) rd=3; blks(512) wr=0
        < 512 512-1K   1-2K   2-4K   4-8K  8-16K 16-32K 32-64K 64-128K >128K
 Reads:     0      1      1      0      0      0      0      0      0      0 
Writes:     0      0      0      0      0      0      0      0      0      0 
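
The flag words in the Adapter Configuration block above can be read as
per-target bitmasks. This is a minimal sketch in Python, assuming bit N
corresponds to SCSI ID N. That mapping is an assumption and is not taken from
the driver source, though it is consistent with the Statistics above: IDs 0
and 2 run at 20.0 MByte/sec and have their Ultra bits set, while IDs 3 and 4
run at 10.0 MByte/sec and do not. The helper name targets_enabled() is made up
for illustration.

```python
def targets_enabled(flag_word, width=16):
    """Return the SCSI IDs whose bit is set in a per-target flag word.

    Assumes bit N of the word corresponds to SCSI ID N.
    """
    return [target for target in range(width) if flag_word & (1 << target)]


# Values copied from the host 0 output above.
flags = {
    "Disconnect Enable": 0x00FF,
    "Ultra Enable": 0x00E7,
    "Tag Queue Enable": 0x0000,
}

for name, word in flags.items():
    print(f"{name:18s} 0x{word:04x} -> {targets_enabled(word)}")
```

Under that assumption, Tag Queue Enable Flags of 0x0000 together with an
actual queue depth of 1 for every device would mean tagged queueing is not in
use on any target in this configuration.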

----------------------------------------------------------------------

wahya:/lhome/wojo$ cat /proc/scsi/aic7xxx/1
Adaptec AIC7xxx driver version: 5.1.0pre6/3.2.4
Compile Options:
  AIC7XXX_RESET_DELAY    : 5
  AIC7XXX_TAGGED_QUEUEING: Adapter Support Enabled
                             Check below to see which
                             devices use tagged queueing
  AIC7XXX_PAGE_ENABLE    : Enabled (This is no longer an option)
  AIC7XXX_PROC_STATS     : Enabled

Adapter Configuration:
           SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter
                           Wide Controller
    PCI MMAPed I/O Base: 0xfecff000
      Adaptec SCSI BIOS: Enabled
                    IRQ: 11
                   SCBs: Active 0, Max Active 1,
                         Allocated 15, HW 16, Page 255
             Interrupts: 30
      BIOS Control Word: 0x18b6
   Adapter Control Word: 0x005d
   Extended Translation: Enabled
Disconnect Enable Flags: 0xffff
 Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 8
    Tagged Queue By Device array for aic7xxx host instance 1:
      {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
    Actual queue depth per device for aic7xxx host instance 1:
      {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

----------------------------------------------------------------------

wahya:/lhome/wojo$ cat /proc/scsi/scsi
Attached devices: 
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: QUANTUM  Model: VIKING 2.3 NSE   Rev: 8808
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: CDC      Model: 94181-15         Rev: 0293
  Type:   Direct-Access                    ANSI SCSI revision: 01 CCS
Host: scsi0 Channel: 00 Id: 02 Lun: 00
  Vendor: SEAGATE  Model: ST32171N         Rev: 0338
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 03 Lun: 00
  Vendor: HP       Model: C3324A           Rev: 5020
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: iomega   Model: jaz 1GB          Rev: J.83
  Type:   Direct-Access                    ANSI SCSI revision: 02




