From owner-freebsd-scsi@FreeBSD.ORG Fri Oct 28 00:27:59 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 92EAE106564A for ; Fri, 28 Oct 2011 00:27:59 +0000 (UTC) (envelope-from peterjeremy@acm.org) Received: from fallbackmx09.syd.optusnet.com.au (fallbackmx09.syd.optusnet.com.au [211.29.132.242]) by mx1.freebsd.org (Postfix) with ESMTP id 1BD438FC16 for ; Fri, 28 Oct 2011 00:27:58 +0000 (UTC) Received: from mail15.syd.optusnet.com.au (mail15.syd.optusnet.com.au [211.29.132.196]) by fallbackmx09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p9RMQAuw029867 for ; Fri, 28 Oct 2011 09:26:10 +1100 Received: from server.vk2pj.dyndns.org (c220-239-116-103.belrs4.nsw.optusnet.com.au [220.239.116.103]) by mail15.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p9RMQ7ub029305 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 28 Oct 2011 09:26:08 +1100 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.5/8.14.4) with ESMTP id p9RMQ4c5040568 for ; Fri, 28 Oct 2011 09:26:04 +1100 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.5/8.14.4/Submit) id p9RMQ3mc040567 for freebsd-scsi@freebsd.org; Fri, 28 Oct 2011 09:26:03 +1100 (EST) (envelope-from peter) Date: Fri, 28 Oct 2011 09:26:03 +1100 From: Peter Jeremy To: freebsd-scsi@freebsd.org Message-ID: <20111027222603.GA39946@server.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="7AUc2qLy4jB3hD7Z" Content-Disposition: inline X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Watchdog timeouts on isp(4) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Oct 2011 00:27:59 -0000 --7AUc2qLy4jB3hD7Z Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable I have a 16-CPU SunFire V890 running FreeBSD-current/sparc64 r226167. Whilst doing some stress testing (6 parallel -j16 buildworlds with regular "sysctl vm.vmtotal"), I have started seeing isp watchdog timeouts. I didn't see this with similar testing on about r223035. Does anyone have any suggestions? The messages look like: (da4:isp0:0:4:0): first watchdog (handle 0xb4d820d9) timed out- deferring f= or grace period (da4:isp0:0:4:0): first watchdog (handle 0xb4d920ba) timed out- deferring f= or grace period (da11:isp1:0:5:0): first watchdog (handle 0x5f0620c9) timed out- deferring = for grace period (da11:isp1:0:5:0): first watchdog (handle 0x5f07203a) timed out- deferring = for grace period (da11:isp1:0:5:0): first watchdog (handle 0x5f08208c) timed out- deferring = for grace period =2E.. (da11:isp1:0:5:0): first watchdog (handle 0x5f762088) timed out- deferring = for grace period isp1: isp_watchdog: timeout for handle 0x5f0620c9 (da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x20 = 0x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp1: isp_watchdog: timeout for handle 0x5f07203a (da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x40 = 0x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp1: isp_watchdog: timeout for handle 0x5f08208c (da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x60 = 0x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp1: isp_watchdog: timeout for handle 0x5f09201f (da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x80 = 0x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp1: bad request handle 0x5f0620c9 (iocb type 0x3) isp1: bad request handle 0x5f07203a (iocb type 0x3) =2E.. (da4:isp0:0:4:0): first watchdog (handle 0x5cf1206d) timed out- deferring f= or grace period (da4:isp0:0:4:0): first watchdog (handle 0x5cf2203a) timed out- deferring f= or grace period isp0: isp_watchdog: timeout for handle 0x5cad2046 (da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xdd 0xe8 0xe0 0= x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp0: bad request handle 0x5cad2046 (iocb type 0x3) isp0: isp_watchdog: timeout for handle 0x5cdb20cb (da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xe3 0xa8 0x00 0= x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp0: isp_watchdog: timeout for handle 0x5cdc2059 (da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xe3 0xa8 0x20 0= x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp0: isp_watchdog: timeout for handle 0x5cdd2020 (da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xe3 0xa8 0x40 0= x00 0x00 0x20 0x00 STS 0x0 XS_ERR=3D0xb isp0: bad request handle 0x5cdb20cb (iocb type 0x3) isp0: bad request handle 0x5cdc2059 (iocb type 0x3) isp0: bad request handle 0x5cdd2020 (iocb type 0x3) The da11 errors are unexpected because there should be no activity on filesystems on that disk. $ df -k Filesystem 1024-blocks Used Avail Capacity Mounted on /dev/da4a 9850318 3495908 5566386 39% / devfs 1 1 0 100% /dev /dev/da4e 981998 17936 885504 2% /tmp /dev/da4d 127993860 60068290 57686062 51% /var /dev/da11a 9850318 1404716 7657578 16% /8 /dev/da11e 981998 7870 895570 1% /8/tmp /dev/da11d 127993860 22228452 95525900 19% /8/var /dev/md0 28784540 1482784 24998996 6% /a $ swapinfo Device 1K-blocks Used Avail Capacity /dev/da0b 67110720 0 67110720 0% /dev/da6b 67110720 0 67110720 0% Total 134221440 0 134221440 0% The isp's and associated disks are: isp0: port 0x300-0x3ff mem 0x400000-0x4= 00fff at device 2.0 on pci1 isp0: invalid NVRAM header isp0: invalid NVRAM header isp0: bad frame length (0) from NVRAM- using 1024 isp0: bad execution throttle of 0- using 16 =2E.. isp1: port 0x1000-0x10ff mem 0x100000-0= x100fff at device 4.0 on pci4 isp2: port 0x1100-0x11ff mem 0x102000-0= x102fff at device 5.0 on pci4 =2E.. (probe6:isp0:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0=20 (probe6:isp0:0:6:0): CAM status: SCSI Status Error (probe6:isp0:0:6:0): SCSI status: Check Condition (probe6:isp0:0:6:0): SCSI sense: UNIT ATTENTION asc:29,1 (Power on occurred) (probe518:isp1:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0=20 (probe518:isp1:0:6:0): CAM status: SCSI Status Error (probe518:isp1:0:6:0): SCSI status: Check Condition (probe518:isp1:0:6:0): SCSI sense: UNIT ATTENTION asc:29,1 (Power on occurr= ed) uhub0: 4 ports with 4 removable, self powered ses0 at isp0 bus 0 scbus1 target 6 lun 0 ses0: Fixed Enclosure Services SCSI-3 device= =20 ses0: 100.000MB/s transfers WWNN 0x508002000025a7c0 WWPN 0x508002000025a7c1= PortID 0xdc ses0: SCSI-3 SES Device ses1 at isp1 bus 0 scbus2 target 6 lun 0 ses1: Fixed Enclosure Services SCSI-3 device= =20 ses1: 100.000MB/s transfers WWNN 0x508002000065a568 WWPN 0x508002000065a569= PortID 0xdc ses1: SCSI-3 SES Device da0 at isp0 bus 0 scbus1 target 0 lun 0 da0: Fixed Direct Access SCSI-3 device=20 da0: 100.000MB/s transfers WWNN 0x20000000875017a5 WWPN 0x21000000875017a5 = PortID 0xef da0: Command Queueing enabled da0: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da1 at isp0 bus 0 scbus1 target 1 lun 0 da1: Fixed Direct Access SCSI-3 device=20 da1: 100.000MB/s transfers WWNN 0x20000000875014ba WWPN 0x21000000875014ba = PortID 0xe8 da1: Command Queueing enabled da1: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da2 at isp0 bus 0 scbus1 target 2 lun 0 da2: Fixed Direct Access SCSI-3 device=20 da2: 100.000MB/s transfers WWNN 0x2000000087501828 WWPN 0x2100000087501828 = PortID 0xe4 da2: Command Queueing enabled da2: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da3 at isp0 bus 0 scbus1 target 3 lun 0 da3: Fixed Direct Access SCSI-3 device=20 da3: 100.000MB/s transfers WWNN 0x200000008796f785 WWPN 0x210000008796f785 = PortID 0xe2 da3: Command Queueing enabled da3: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da4 at isp0 bus 0 scbus1 target 4 lun 0 da4: Fixed Direct Access SCSI-3 device=20 da4: 100.000MB/s transfers WWNN 0x2000000087501755 WWPN 0x2100000087501755 = PortID 0xe1 da4: Command Queueing enabled da4: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da5 at isp0 bus 0 scbus1 target 5 lun 0 da5: Fixed Direct Access SCSI-3 device=20 da5: 100.000MB/s transfers WWNN 0x2000000087505ebc WWPN 0x2100000087505ebc = PortID 0xe0 da5: Command Queueing enabled da5: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da6 at isp1 bus 0 scbus2 target 0 lun 0 da6: Fixed Direct Access SCSI-4 device=20 da6: 100.000MB/s transfers WWNN 0x500000e01999c6b0 WWPN 0x500000e01999c6b1 = PortID 0xef da6: Command Queueing enabled da6: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da7 at isp1 bus 0 scbus2 target 1 lun 0 da7: Fixed Direct Access SCSI-4 device=20 da7: 100.000MB/s transfers WWNN 0x500000e01365b4e0 WWPN 0x500000e01365b4e1 = PortID 0xe8 da7: Command Queueing enabled da7: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da8 at isp1 bus 0 scbus2 target 2 lun 0 da8: Fixed Direct Access SCSI-4 device=20 da8: 100.000MB/s transfers WWNN 0x500000e014422a50 WWPN 0x500000e014422a51 = PortID 0xe4 da8: Command Queueing enabled da8: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da9 at isp1 bus 0 scbus2 target 3 lun 0 da9: Fixed Direct Access SCSI-4 device=20 da9: 100.000MB/s transfers WWNN 0x500000e01998e820 WWPN 0x500000e01998e821 = PortID 0xe2 da9: Command Queueing enabled da9: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da10 at isp1 bus 0 scbus2 target 4 lun 0 da10: Fixed Direct Access SCSI-4 device=20 da10: 100.000MB/s transfers WWNN 0x500000e014397bb0 WWPN 0x500000e014397bb1= PortID 0xe1 da10: Command Queueing enabled da10: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) da11 at isp1 bus 0 scbus2 target 5 lun 0 da11: Fixed Direct Access SCSI-4 device=20 da11: 100.000MB/s transfers WWNN 0x500000e0143f8ee0 WWPN 0x500000e0143f8ee1= PortID 0xe0 da11: Command Queueing enabled da11: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C) pciconf reports: isp0@pci1:0:2:0: class=3D0x010000 card=3D0x00000000 chip=3D0x2200107= 7 rev=3D0x05 hdr=3D0x00 vendor =3D 'QLogic Corp.' device =3D 'QLA2200 64-bit Fibre Channel Adapter' class =3D mass storage subclass =3D SCSI bar [10] =3D type I/O Port, range 32, base 0x300, size 256, enabled bar [14] =3D type Memory, range 32, base 0x400000, size 4096, enabled cap 01[44] =3D powerspec 1 supports D0 D3 current D0 isp1@pci3:1:4:0: class=3D0x010000 card=3D0x40831077 chip=3D0x2200107= 7 rev=3D0x05 hdr=3D0x00 vendor =3D 'QLogic Corp.' device =3D 'QLA2200 64-bit Fibre Channel Adapter' class =3D mass storage subclass =3D SCSI bar [10] =3D type I/O Port, range 32, base 0x1000, size 256, enabled bar [14] =3D type Memory, range 32, base 0x100000, size 4096, enabled cap 01[44] =3D powerspec 1 supports D0 D3 current D0 isp2@pci3:1:5:0: class=3D0x010000 card=3D0x40831077 chip=3D0x2200107= 7 rev=3D0x05 hdr=3D0x00 vendor =3D 'QLogic Corp.' device =3D 'QLA2200 64-bit Fibre Channel Adapter' class =3D mass storage subclass =3D SCSI bar [10] =3D type I/O Port, range 32, base 0x1100, size 256, enabled bar [14] =3D type Memory, range 32, base 0x102000, size 4096, enabled cap 01[44] =3D powerspec 1 supports D0 D3 current D0 --=20 Peter Jeremy --7AUc2qLy4jB3hD7Z Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk6p2nsACgkQ/opHv/APuIfIEACfWA9wU0okJROhk5ij7SJ3wgXE G00AoImKh2MUQ+6VEB2Nx7IvaDpKBRgY =rlxj -----END PGP SIGNATURE----- --7AUc2qLy4jB3hD7Z--