From owner-freebsd-bugs@FreeBSD.ORG Thu Apr 15 07:30:02 2010
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 40B94106566C
	for ; Thu, 15 Apr 2010 07:30:02 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 014AA8FC1E
	for ; Thu, 15 Apr 2010 07:30:02 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3F7U1Va063494
	for ; Thu, 15 Apr 2010 07:30:01 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3F7U1Um063490;
	Thu, 15 Apr 2010 07:30:01 GMT
	(envelope-from gnats)
Resent-Date: Thu, 15 Apr 2010 07:30:01 GMT
Resent-Message-Id: <201004150730.o3F7U1Um063490@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Daniel Black
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5EBD31065673
	for ; Thu, 15 Apr 2010 07:23:29 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 4B4E58FC19
	for ; Thu, 15 Apr 2010 07:23:29 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o3F7NTcS087095
	for ; Thu, 15 Apr 2010 07:23:29 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o3F7NT9b087094;
	Thu, 15 Apr 2010 07:23:29 GMT
	(envelope-from nobody)
Message-Id: <201004150723.o3F7NT9b087094@www.freebsd.org>
Date: Thu, 15 Apr 2010 07:23:29 GMT
From: Daniel Black
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/145714: [pmp][siis] removed SATA device on port multiplier resets entire channel loosing all other devices (8.0-stable)
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 15 Apr 2010 07:30:02 -0000

>Number:         145714
>Category:       kern
>Synopsis:       [pmp][siis] removed SATA device on port multiplier resets entire channel loosing all other devices (8.0-stable)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 15 07:30:01 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Daniel Black
>Release:        8.0
>Organization:
OVEE
>Environment:
FreeBSD brm00.smartcars.in.nicta.com.au 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Apr 16 01:53:45 EST 2010 root@brm00.smartcars.in.nicta.com.au:/usr/obj/usr/src/sys/BRM amd64

cvsup of stable as of a few hours ago
>Description:
A SATA hard drive was physically removed from one of the ports of a Silicon Image 3726 port multiplier. The kernel log shows the entire port multiplier being reset, losing the 4 other devices. Even after the reset, the other devices do not recover.

# pciconf -lvc
atapci1@pci0:0:31:2:	class=0x01018a card=0xb0021458 chip=0x3a208086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'SATA2(4Port2) (ICH10 Family)'
    class      = mass storage
    subclass   = ATA
    cap 01[70] = powerspec 3  supports D0 D3  current D0
    cap 13[b0] = PCI Advanced Features: FLR TP
none1@pci0:0:31:3:	class=0x0c0500 card=0x50011458 chip=0x3a308086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'SMB controller (50011458)'
    class      = serial bus
    subclass   = SMBus
atapci2@pci0:0:31:5:	class=0x010185 card=0xb0021458 chip=0x3a268086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'SATA2(2Port2) (ICH10 Family)'
    class      = mass storage
    subclass   = ATA
    cap 01[70] = powerspec 3  supports D0 D3  current D0
    cap 13[b0] = PCI Advanced Features: FLR TP
siis0@pci0:5:0:0:	class=0x010400 card=0x71321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI Express (1x) to 2 Port SATA300 (SiI 3132)'
    class      = mass storage
    subclass   = RAID
    cap 01[54] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 1 legacy endpoint max data 128(1024) link x1(x1)
siis1@pci0:6:0:0:	class=0x010400 card=0x71321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI Express (1x) to 2 Port SATA300 (SiI 3132)'
    class      = mass storage
    subclass   = RAID
    cap 01[54] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 1 legacy endpoint max data 128(1024) link x1(x1)
atapci0@pci0:7:0:0:	class=0x010185 card=0xb0001458 chip=0x2368197b rev=0x00 hdr=0x00
    vendor     = 'JMicron Technology Corp.'
    device     = 'JMB368 IDE Controller'
    class      = mass storage
    subclass   = ATA
    cap 01[68] = powerspec 2  supports D0 D3  current D0
    cap 10[50] = PCI-Express 1 legacy endpoint IRQ 2 max data 128(128) link x1(x1)

# camcontrol devlist
at scbus0 target 0 lun 0 (pass0,ada0)
at scbus0 target 1 lun 0 (pass1,ada1)
at scbus0 target 2 lun 0 (pass2,ada2)
at scbus0 target 3 lun 0 (pass3,ada3)
at scbus0 target 4 lun 0 (pass4,ada4)
at scbus0 target 15 lun 0 (pass5,pmp2)
at scbus3 target 0 lun 0 (pass12,ada10)
at scbus3 target 1 lun 0 (pass13,ada11)
at scbus3 target 2 lun 0 (pass14,ada12)
at scbus3 target 3 lun 0 (pass15,ada13)
at scbus3 target 4 lun 0 (pass16,ada14)
at scbus3 target 15 lun 0 (pass17,pmp1)

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq8: rtc                         649492        127
irq14: ata0                        62691         12
irq16: uhci0 siis0+               452808         89
irq17: siis1                     6932183       1365
irq18: uhci2 ehci0+                   18          0
cpu0: timer                      5072836        999
irq256: re0                        13586          2
cpu1: timer                      5071093        999
cpu2: timer                      5070729        999
cpu3: timer                      5070492        999
Total                           28395930       5595

dmesg:
Disk ada7 was removed. ada5,6,8

Apr 16 03:53:42 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada7 offset=262144 size=8192 error=6
Apr 16 03:53:42 brm00 kernel: (ada7:siisch2:0:2:0): lost device
Apr 16 03:53:42 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada7 offset=2000398319616 size=8192 error=6
Apr 16 03:53:42 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada7 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:53:52 brm00 kernel: siisch2: device ready timeout
Apr 16 03:53:52 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:53:52 brm00 kernel: (ada9:siisch2:0:
Apr 16 03:53:52 brm00 kernel: 4:0): lost device
Apr 16 03:53:52 brm00 kernel:
Apr 16 03:53:52 brm00 kernel: (ada8:siisch2:0:3:0): lost device
Apr 16 03:53:52 brm00 kernel: (ada6:siisch2:0:1:0): lost device
Apr 16 03:53:52 brm00 kernel: (ada5:siisch2:0:0:0): lost device
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada9 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 kernel: (ada7:siisch2:0:2:0): Synchronize cache failed
Apr 16 03:53:52 brm00 kernel:
Apr 16 03:53:52 brm00 kernel: (ada7:siisch2:0:2:0): removing device entry
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada9 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada9 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada8 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada8 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada8 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada6 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada6 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada6 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada5 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada5 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada5 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: zpool I/O failure, zpool=tank error=6
Apr 16 03:53:52 brm00 last message repeated 6 times
Apr 16 03:53:52 brm00 kernel: (pmp0:siisch2:0:15:0): lost device
Apr 16 03:53:52 brm00 root: ZFS: zpool I/O failure, zpool=tank error=6
Apr 16 03:53:53 brm00 root: ZFS: vdev failure, zpool=tank type=vdev.no_replicas
Apr 16 03:55:52 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:55:52 brm00 kernel: siisch2: device ready timeout
Apr 16 03:55:52 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:57:30 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:57:30 brm00 kernel: siisch2: device ready timeout
Apr 16 03:57:30 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:58:05 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:58:05 brm00 kernel: siisch2: device ready timeout
Apr 16 03:58:05 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:59:11 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:59:11 brm00 kernel: siisch2: device ready timeout
Apr 16 03:59:11 brm00 kernel: siisch2: trying full port reset ...
Apr 16 04:05:11 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 04:05:11 brm00 kernel: siisch2: device ready timeout
Apr 16 04:05:11 brm00 kernel: siisch2: trying full port reset ...
Apr 16 04:07:40 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 04:07:40 brm00 kernel: siisch2: device ready timeout
Apr 16 04:07:40 brm00 kernel: siisch2: trying full port reset ...

# zpool status -v
(froze - truss revealed no system calls)
>How-To-Repeat:
Install 5 disks in a port multiplier.
Put them in use (e.g. raidz2 configuration).
Remove a disk.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
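
For anyone triaging a similar failure, the syslog excerpt above can be summarised mechanically: which devices were reported lost on each channel, and how many times the driver attempted a full port reset. The following is only a rough sketch. The function name `summarize` and the two regular expressions are illustrative assumptions based on the message shapes in this report, not part of any FreeBSD tool, and other driver versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled on the messages in this report (an assumption; other
# kernels or drivers may format them differently).
LOST_RE = re.compile(r"\((\w+):(\w+):\d+:\d+:\d+\): lost device")
RESET_RE = re.compile(r"kernel: (\w+): trying full port reset")


def summarize(lines):
    """Return (lost, resets).

    lost   maps channel name -> list of device names reported lost
    resets maps channel name -> number of full port reset attempts
    """
    lost = {}
    resets = Counter()
    for line in lines:
        m = LOST_RE.search(line)
        if m:
            device, channel = m.groups()
            lost.setdefault(channel, []).append(device)
            continue
        m = RESET_RE.search(line)
        if m:
            resets[m.group(1)] += 1
    return lost, resets


if __name__ == "__main__":
    # A few lines copied from the log in this report.
    sample = [
        "Apr 16 03:53:42 brm00 kernel: (ada7:siisch2:0:2:0): lost device",
        "Apr 16 03:53:52 brm00 kernel: siisch2: trying full port reset ...",
        "Apr 16 03:53:52 brm00 kernel: (ada8:siisch2:0:3:0): lost device",
        "Apr 16 03:55:52 brm00 kernel: siisch2: trying full port reset ...",
    ]
    lost, resets = summarize(sample)
    print(lost)
    print(dict(resets))
```

Note that a message interleaved across two syslog entries, as happened for ada9 in the log above, will not match the single-line pattern, so the per-channel device list should be treated as a lower bound.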