Date: Wed, 21 Sep 2005 21:41:28 -0000
From: "buffalo6262" <buffalo@radix.net>
To: aic7xxx@freebsd.org
Subject: Hot Swapping SCA Drives Generates Kernel Errors With Onboard AIC-7902
Message-ID: <dgsk28%2Bknqc@eGroups.com>
Greetings,

I have a Supermicro 6023-P8R 2U rackmount Xeon server with an onboard, dual-channel AIC-7902 controller, running Red Hat Enterprise Linux ES 4.0. The AIC79XX driver is version 1.3.11.

This box has six internal SCA drive slots, which are fully populated with Seagate U320 drives (two ST336607LC and four ST336754LC, all with the latest firmware version). All of these are serviced by channel A of the controller.

Next, I have a Storcase Infostation 14-bay SCA drive case, populated with nine Seagate ST3146807LC U320 drives (all with the latest firmware version). The Infostation 14 has dual backplanes, each of which serves seven SCA slots. These two backplanes can be joined into one with a cable (which I've done). The Infostation and its drive bays are serviced by channel B of the AIC-7902 controller.

The driver seems to see everything OK at boot time:

==================================================
SCSI subsystem initialized
ACPI: PCI interrupt 0000:06:02.0[A] -> GSI 76 (level, low) -> IRQ 217
ACPI: PCI interrupt 0000:06:02.1[B] -> GSI 77 (level, low) -> IRQ 225
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
(scsi0:A:0): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi0:A:1): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi0:A:2): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi0:A:3): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi0:A:4): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi0:A:5): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Vendor: SEAGATE Model: ST336607LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled.
Depth 4
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Vendor: SEAGATE Model: ST336607LC Rev: 0007
scsi0:A:1:0: Tagged Queuing enabled. Depth 4
SCSI device sdb: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2 sdb3
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
Vendor: SEAGATE Model: ST336754LC Rev: 0003
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:2:0: Tagged Queuing enabled. Depth 4
SCSI device sdc: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sdc: drive cache: write back
sdc: sdc1
Attached scsi disk sdc at scsi0, channel 0, id 2, lun 0
Vendor: SEAGATE Model: ST336754LC Rev: 0003
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:3:0: Tagged Queuing enabled. Depth 4
SCSI device sdd: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sdd: drive cache: write back
sdd: sdd1
Attached scsi disk sdd at scsi0, channel 0, id 3, lun 0
Vendor: SEAGATE Model: ST336754LC Rev: 0003
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:4:0: Tagged Queuing enabled. Depth 4
SCSI device sde: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sde: drive cache: write back
sde: sde1
Attached scsi disk sde at scsi0, channel 0, id 4, lun 0
Vendor: SEAGATE Model: ST336754LC Rev: 0003
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:5:0: Tagged Queuing enabled. Depth 4
SCSI device sdf: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sdf: drive cache: write back
sdf: sdf1
Attached scsi disk sdf at scsi0, channel 0, id 5, lun 0
Vendor: SUPER Model: GEM318 Rev: 0
Type: Processor ANSI SCSI revision: 02
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
(scsi1:A:0): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:1): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:2): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:3): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:4): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:5): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:6): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:8): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
(scsi1:A:9): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:0:0: Tagged Queuing enabled. Depth 4
SCSI device sdg: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdg: drive cache: write back
sdg: sdg1
Attached scsi disk sdg at scsi1, channel 0, id 0, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:1:0: Tagged Queuing enabled. Depth 4
SCSI device sdh: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdh: drive cache: write back
sdh: sdh1
Attached scsi disk sdh at scsi1, channel 0, id 1, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:2:0: Tagged Queuing enabled. Depth 4
SCSI device sdi: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdi: drive cache: write back
sdi: sdi1
Attached scsi disk sdi at scsi1, channel 0, id 2, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:3:0: Tagged Queuing enabled. Depth 4
SCSI device sdj: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdj: drive cache: write back
sdj: sdj1
Attached scsi disk sdj at scsi1, channel 0, id 3, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:4:0: Tagged Queuing enabled. Depth 4
SCSI device sdk: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdk: drive cache: write back
sdk: sdk1
Attached scsi disk sdk at scsi1, channel 0, id 4, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:5:0: Tagged Queuing enabled. Depth 4
SCSI device sdl: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdl: drive cache: write back
sdl: sdl1
Attached scsi disk sdl at scsi1, channel 0, id 5, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:6:0: Tagged Queuing enabled. Depth 4
SCSI device sdm: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdm: drive cache: write back
sdm: sdm1
scsi1:A:8:0: Tagged Queuing enabled. Depth 4
SCSI device sdn: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdn: drive cache: write back
sdn: sdn1
Attached scsi disk sdn at scsi1, channel 0, id 8, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:9:0: Tagged Queuing enabled.
Depth 4
SCSI device sdo: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdo: drive cache: write back
sdo: sdo1
Attached scsi disk sdo at scsi1, channel 0, id 9, lun 0
Attached scsi disk sdm at scsi1, channel 0, id 6, lun 0
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
==================================================================

While trying to build a software RAID array on channel B, I've run into a problem when running commands that add and remove drives from the kernel, such as:

echo "scsi remove-single-device 1 0 8 0" > /proc/scsi/scsi

Normally, I get the expected response from the kernel, like:

==========================================
Synchronizing SCSI cache for disk sdo:
(scsi1:A:9): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Vendor: SEAGATE Model: ST3146807LC Rev: 0007
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:9:0: Tagged Queuing enabled. Depth 4
SCSI device sdo: 286749488 512-byte hdwr sectors (146816 MB)
SCSI device sdo: drive cache: write back
sdo: sdo1
Attached scsi disk sdo at scsi1, channel 0, id 9, lun 0
Attached scsi generic sg15 at scsi1, channel 0, id 9, lun 0, type 0
===========================================

But on occasion (and this is after giving the drives a full minute to spin up and stabilize after plugging them into their slots), I'm getting messages like:

===========================================
scsi1: ILLEGAL_PHASE 0x80
scsi1:A:8:0: Attempt to issue message failed
(scsi1:A:8): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
scsi1: ILLEGAL_PHASE 0x80
scsi1:A:9:0: Attempt to issue message failed
(scsi1:A:9): 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
============================================

Sometimes repeating the "scsi add-single-device" command a second or third time (when the first attempt has errored out) will cause the kernel to add the drive, and then all is well.
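
For reference, the add and remove commands go through the legacy /proc/scsi/scsi interface and take four arguments: host, channel, id, and lun. The following is only a minimal sketch of that usage. The helper names scsi_cmd and scsi_hotplug and the SCSI_PROC variable are hypothetical, chosen for illustration, and are not part of the driver or of any standard tool.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any real tool).
# SCSI_PROC can be overridden, e.g. to point at a scratch file for a dry run.
SCSI_PROC="${SCSI_PROC:-/proc/scsi/scsi}"

# Build the command string understood by the legacy procfs interface.
# Arguments: action (add|remove), host, channel, id, lun.
scsi_cmd() {
    if [ "$#" -ne 5 ]; then
        echo "usage: scsi_cmd add|remove HOST CHANNEL ID LUN" >&2
        return 1
    fi
    case "$1" in
        add|remove) ;;
        *) echo "scsi_cmd: unknown action '$1'" >&2; return 1 ;;
    esac
    printf 'scsi %s-single-device %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Write the command to the procfs file (needs root on a real system).
scsi_hotplug() {
    cmd=$(scsi_cmd "$@") || return 1
    printf '%s\n' "$cmd" > "$SCSI_PROC"
}
```

With these definitions, `scsi_hotplug remove 1 0 8 0` writes the same string as the echo command shown earlier (scsi1, channel 0, id 8, lun 0), and `scsi_hotplug add 1 0 8 0` requests a rescan of that same address.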
But in several instances, additional "scsi add-single-device" commands have resulted in the driver dumping the controller card state, trying to reset all devices and the bus, timing out, kicking out devices that had been hot-added into the kernel, and so on. When this happens, a reboot is required to stabilize everything.

I've run Seagate's "seatools" utility to test all the drives and make sure that none of them has an obvious problem (they all test OK, and the SMART attributes all look OK).

Any ideas as to what might be provoking all this?

TIA for any and all responses/pointers/suggestions, etc.

--Duncan