From owner-freebsd-scsi Mon Dec 27 3:10:34 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from ns1.aha.ru (ns1.aha.ru [195.2.80.142])
    by hub.freebsd.org (Postfix) with ESMTP id 1B75814EEF
    for ; Mon, 27 Dec 1999 03:10:28 -0800 (PST)
    (envelope-from abb@zenon.net)
Received: from pb.hq.zenon.net (pb.zenon.net [195.2.64.18])
    by ns1.aha.ru (8.9.3/8.9.3/aha-r/0.04B) with ESMTP id OAA00208;
    Mon, 27 Dec 1999 14:10:23 +0300 (MSK)
Received: from zenon.net (abb.hq.zenon.net [192.168.9.25])
    by pb.hq.zenon.net (8.9.3/8.9.3) with ESMTP id OAA89852;
    Mon, 27 Dec 1999 14:10:22 +0300 (MSK)
Message-ID: <386749C0.34EBADE5@zenon.net>
Date: Mon, 27 Dec 1999 14:13:04 +0300
From: Alexander Bezroutchko
X-Mailer: Mozilla 4.6 [en] (X11; I; FreeBSD 3.3-RELEASE i386)
X-Accept-Language: en
MIME-Version: 1.0
To: Tom
Cc: scsi@freebsd.org
Subject: Re: IFT3102 and FreeBSD 3.4-STABLE troubles
References:
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Tom wrote:

> Well, I'm testing a single IFT-3102U2G on a dual-PIII under 3.4 stable.
> I'm assuming that the take-over by the redundant controller is similar
> in appearance to the host, as resetting a controller.

Yes, but a take-over takes 4 seconds, while a reset takes about 1 minute.

> I've done a few resets of the IFT controller until full load (three
> instances of postmark). FreeBSD paused, then printed a bunch of errors
> and then continued. I have had a incident where FreeBSD just hung after
> resetting the controller. I couldn't reproduce it though.

I've done a lot of resets, and all of them led to a crash or
inoperability ;(. For example, take a look at the following console
snapshots:

Example N1:
~~~~~~~~~~~

Host has no local storage, ahc0 connected to IFT (swap resides on da0b):

-- before controller reset ---
box2# dmesg
...
ahc0: rev 0x00 int a irq 19 on pci0.12.0
ahc0: aic7896/97 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: rev 0x00 int a irq 19 on pci0.12.1
ahc1: aic7896/97 Wide Channel B, SCSI Id=7, 16/255 SCBs
...
SMP: AP CPU #1 Launched!
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 5120MB (10485760 512 byte sectors: 255H 63S/T 652C)
da1 at ahc0 bus 0 target 0 lun 1
da1: Fixed Direct Access SCSI-2 device
da1: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 10000MB (20480000 512 byte sectors: 255H 63S/T 1274C)
da2 at ahc0 bus 0 target 0 lun 2
da2: Fixed Direct Access SCSI-2 device
da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da2: 5120MB (10485760 512 byte sectors: 255H 63S/T 652C)
da3 at ahc0 bus 0 target 0 lun 3
da3: Fixed Direct Access SCSI-2 device
da3: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da3: 10000MB (20480000 512 byte sectors: 255H 63S/T 1274C)
changing root device to da0s1a
...
box2# mount
/dev/da0s1a on / (ufs, local, writes: sync 7 async 62)
/dev/da0s1e on /var (ufs, local, writes: sync 100 async 150)
/dev/da1a on /usr/obj (ufs, local, writes: sync 2 async 0)
procfs on /proc (procfs, local)

--- then, I run 'find' and reset controller ---
box2# find / > /dev/null
(da0:ahc0:0:0:0): Invalidating pack
(da0:ahc0:0:0:0): Invalidating pack
Dec 27 12:48:03 box2 /kernel: (da0:ahc0:0:0:0): Invalidating pack
Dec 27 12:48:03 box2 /kernel: (da0:ahc0:0:0:0): Invalidating pack
Dec 27 12:48:35 box2 /kernel: (da0:ahc0:0:0:0): Invalidating pack
Dec 27 12:48:35 box2 /kernel: (da0:ahc0:0:0:0): Invalidating pack
zsh: device not configured: /var/mail/root
box2#

--- here controller is up, but freebsd is broken
box2# ls
spec_getpages: I/O read failure: (error code=6)
size: 65536, resid: 65536, a_count: 65536, valid: 0x0
nread: 0, reqpage: 0, pindex: 0, pcount: 16
spec_getpages: I/O read failure: (error code=6)
size: 65536, resid: 65536, a_count: 65536, valid: 0x0
nread: 0, reqpage: 0, pindex: 0, pcount: 16
Dec 27 12:51:54 box2 /kernel: spec_getpages: I/O read failure: (error code=6)
zsh: device not configured: ls
Dec 27 12:51:54 box2 /kernel: spec_getpages: I/O read failure: (error code=6)
Dec 27 12:51:54 box2 /kernel: size: 65536, resid: 65536, a_count: 65536, valid: 0x0
Dec 27 12:51:54 box2 /kernel: size: 65536, resid: 65536, a_count: 65536, valid: 0x0
Dec 27 12:51:54 box2 /kernel: nread: 0, reqpage: 0, pindex: 0, pcount: 16
zsh: device not configured: /var/mail/root
Dec 27 12:51:54 box2 /kernel: nread: 0, reqpage: 0, pindex: 0, pcount: 16
Dec 27 12:51:54 box2 /kernel: spec_getpages: I/O read failure: (error code=6)
Dec 27 12:51:54 box2 /kernel: spec_getpages: I/O read failure: (error code=6)
Dec 27 12:51:54 box2 /kernel: size: 65536, resid: 65536, a_count: 65536, valid: 0x0
Dec 27 12:51:54 box2 /kernel: size: 65536, resid: 65536, a_count: 65536, valid: 0x0
Dec 27 12:51:54 box2 /kernel: nread: 0, reqpage: 0, pindex: 0, pcount: 16
Dec 27 12:51:54 box2 /kernel: nread: 0, reqpage: 0, pindex: 0, pcount: 16
box2#
----------------------------------------------------------------

Example N2:
~~~~~~~~~~~

Host has one local disk (da4) connected to ahc1, ahc0 connected to IFT
(swap resides on da4b).

--- before controller reset ---
box1# dmesg
...
ahc0: rev 0x00 int a irq 19 on pci0.12.0
ahc0: aic7896/97 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: rev 0x00 int a irq 19 on pci0.12.1
ahc1: aic7896/97 Wide Channel B, SCSI Id=7, 16/255 SCBs
SMP: AP CPU #1 Launched!
...
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 5120MB (10485760 512 byte sectors: 255H 63S/T 652C)
da4 at ahc1 bus 0 target 0 lun 0
da4: Fixed Direct Access SCSI-2 device
da4: 10.000MB/s transfers (5.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da4: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da1 at ahc0 bus 0 target 0 lun 1
da1: Fixed Direct Access SCSI-2 device
da1: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 10000MB (20480000 512 byte sectors: 255H 63S/T 1274C)
da2 at ahc0 bus 0 target 0 lun 2
da2: Fixed Direct Access SCSI-2 device
da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da2: 5120MB (10485760 512 byte sectors: 255H 63S/T 652C)
da3 at ahc0 bus 0 target 0 lun 3
da3: Fixed Direct Access SCSI-2 device
da3: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da3: 10000MB (20480000 512 byte sectors: 255H 63S/T 1274C)
changing root device to da4s4a
changing root device to da4a
...
box1# mount
/dev/da4a on / (ufs, local, writes: sync 135 async 294)
procfs on /proc (procfs, local)
/dev/da0a on /mnt (ufs, local, writes: sync 2 async 1)
...
--- then I run 'find' and reset controller ---
box1# find /mnt -ls > /dev/null
(da0:ahc0:0:0:0): Invalidating pack
Dec 27 13:37:21 box1 /kernel: (da0:ahc0:0:0:0): Invalidating pack
Dec 27 13:37:21 box1 /kernel: (da0:ahc0:0:0:0): Invalidating pack
(da0:ahc0:0:0:0): Invalidating pack
(da0:ahc0:0:0:0): Invalidating pack
(da0:ahc0:0:0:0): Invalidating pack
(da0:ahc0:0:0:0): Invalidating pack
Dec 27 13:37:33 box1 last message repeated 4 times
Dec 27 13:37:33 box1 last message repeated 4 times
find: /mnt/usr/home: Device not configured
find: /mnt/usr/obj: Device not configured
find: /mnt/usr/games: Device not configured
find: /mnt/usr/ports: Device not configured
find: sys: Device not configured
find: home: Device not configured
box1#

--- here controller is up, but freebsd is broken
box1# ls -la /mnt
box1# umount /mnt
umount: unmount of /mnt failed: Device not configured
box1#
----------------------------------------------------------------

Unfortunately, I have never seen FreeBSD behave correctly after a
controller reset that occurred during any activity on a filesystem
mounted from the IFT.

I have seen about 20 messages related to IFT SCSI-to-SCSI controllers
on this list. Has anybody investigated the behaviour of FreeBSD during
controller take-over and reset?

SY,
Alexander Bezroutchko

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message