From owner-freebsd-fs@FreeBSD.ORG Thu Aug 1 14:20:02 2013
Date: Thu, 1 Aug 2013 14:20:01 GMT
Message-Id: <201308011420.r71EK1qv099621@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Christopher Harrison
Subject: Re: kern/177536: [zfs] zfs livelock (deadlock) with high write-to-disk load
Reply-To: Christopher Harrison
List-Id: Filesystems

The following reply was made to PR kern/177536; it has been noted by GNATS.

From: Christopher Harrison
To: Martin Birgmeier
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/177536: [zfs] zfs livelock (deadlock) with high write-to-disk load
Date: Thu, 01 Aug 2013 09:11:30 -0500

Just last night I got the system to deadlock again. I grabbed a kernel bt, but I realized I do not have full debug enabled. I think our problems are two different ones.
As my bt points directly to an LSI adapter or kernel driver issue.
-C

On 8/1/13 12:20 AM, Martin Birgmeier wrote:
> It is raidz2, using gpt partitions each covering half of six 2 TB disks.
> The remainder of these disks are mostly unused, with one containing a
> small UFS root for booting, and the others some small partitions which
> are used for various VirtualBox instances.
>
> Regards,
>
> Martin
>
> On 07/31/13 22:32, Christopher D. Harrison wrote:
>> what raidz level are you using: raidz2, raid0, raid1
>> also what are the pool config (see below, my pool is called z):
>>
>> MOS Configuration:
>>     version: 28
>>     name: 'z'
>>     state: 0
>>     txg: 214028175
>>     pool_guid: 5334451366918939808
>>     hostid: 4266313884
>>     hostname: 'pisces.biostat.wisc.edu'
>>     hole_array[0]: 3
>>     vdev_children: 4
>>     vdev_tree:
>>         type: 'root'
>>         id: 0
>>         guid: 5334451366918939808
>>         children[0]:
>>             type: 'raidz'
>>             id: 0
>>             guid: 10192098197416954465
>>             nparity: 2
>>             metaslab_array: 34
>>             metaslab_shift: 37
>>             ashift: 9
>>             asize: 16003083796480
>>             is_log: 0
>>             create_txg: 4
>>             children[0]:
>>                 type: 'disk'
>>                 id: 0
>>                 guid: 5442684633350386182
>>                 path: '/dev/da3p1'
>>                 phys_path: '/dev/da3p1'
>>                 whole_disk: 1
>>                 DTL: 2318341
>>                 create_txg: 4
>>             children[1]:
>>                 type: 'disk'
>>                 id: 1
>>                 guid: 4184769488943604229
>>                 path: '/dev/da2p1'
>>                 phys_path: '/dev/da2p1'
>>                 whole_disk: 1
>>                 DTL: 2318340
>>                 create_txg: 4
>>             children[2]:
>>                 type: 'disk'
>>                 id: 2
>>                 guid: 6673578809124996458
>>                 path: '/dev/da1p1'
>>                 phys_path: '/dev/da1p1'
>>                 whole_disk: 1
>>                 DTL: 2318358
>>                 create_txg: 4
>>             children[3]:
>>                 type: 'disk'
>>                 id: 3
>>                 guid: 14565784372994613264
>>                 path: '/dev/da0p1'
>>                 phys_path: '/dev/da0p1'
>>                 whole_disk: 1
>>                 DTL: 2318357
>>                 create_txg: 4
>>             children[4]:
>>                 type: 'disk'
>>                 id: 4
>>                 guid: 17127372890035360647
>>                 path: '/dev/da6p1'
>>                 phys_path: '/dev/da6p1'
>>                 whole_disk: 1
>>                 DTL: 2318356
>>                 create_txg: 4
>>             children[5]:
>>                 type: 'disk'
>>                 id: 5
>>                 guid: 5667576937780231702
>>                 path: '/dev/label/disk05'
>>                 phys_path: '/dev/label/disk05'
>>                 whole_disk: 1
>>                 DTL: 26389621
>>                 create_txg: 4
>>             children[6]:
>>                 type: 'disk'
>>                 id: 6
>>                 guid: 16545540667878636920
>>                 path: '/dev/da5p1'
>>                 phys_path: '/dev/da5p1'
>>                 whole_disk: 1
>>                 DTL: 2318354
>>                 create_txg: 4
>>             children[7]:
>>                 type: 'disk'
>>                 id: 7
>>                 guid: 16180883375304336701
>>                 path: '/dev/da4p1'
>>                 phys_path: '/dev/da4p1'
>>                 whole_disk: 1
>>                 DTL: 2318353
>>                 create_txg: 4
>>         children[1]:
>>             type: 'raidz'
>>             id: 1
>>             guid: 7000405778472969514
>>             nparity: 2
>>             metaslab_array: 32
>>             metaslab_shift: 37
>>             ashift: 9
>>             asize: 16003083796480
>>             is_log: 0
>>             create_txg: 4
>>             children[0]:
>>                 type: 'disk'
>>                 id: 0
>>                 guid: 11848332905648146086
>>                 path: '/dev/da10p1'
>>                 phys_path: '/dev/da10p1'
>>                 whole_disk: 1
>>                 DTL: 2318352
>>                 create_txg: 4
>>             children[1]:
>>                 type: 'disk'
>>                 id: 1
>>                 guid: 17180048053682905420
>>                 path: '/dev/da9p1'
>>                 phys_path: '/dev/da9p1'
>>                 whole_disk: 1
>>                 DTL: 2318351
>>                 create_txg: 4
>>             children[2]:
>>                 type: 'disk'
>>                 id: 2
>>                 guid: 14648362260613207535
>>                 path: '/dev/da8p1'
>>                 phys_path: '/dev/da8p1'
>>                 whole_disk: 1
>>                 DTL: 2318350
>>                 create_txg: 4
>>             children[3]:
>>                 type: 'disk'
>>                 id: 3
>>                 guid: 2512073543886070959
>>                 path: '/dev/da7p1'
>>                 phys_path: '/dev/da7p1'
>>                 whole_disk: 1
>>                 DTL: 2318349
>>                 create_txg: 4
>>             children[4]:
>>                 type: 'disk'
>>                 id: 4
>>                 guid: 9213113982598250625
>>                 path: '/dev/da14p1'
>>                 phys_path: '/dev/da14p1'
>>                 whole_disk: 1
>>                 DTL: 2318348
>>                 create_txg: 4
>>             children[5]:
>>                 type: 'disk'
>>                 id: 5
>>                 guid: 14965308070554000358
>>                 path: '/dev/da13p1'
>>                 phys_path: '/dev/da13p1'
>>                 whole_disk: 1
>>                 DTL: 2318347
>>                 create_txg: 4
>>             children[6]:
>>                 type: 'disk'
>>                 id: 6
>>                 guid: 8812761598924675500
>>                 path: '/dev/da12p1'
>>                 phys_path: '/dev/da12p1'
>>                 whole_disk: 1
>>                 DTL: 2318346
>>                 create_txg: 4
>>             children[7]:
>>                 type: 'disk'
>>                 id: 7
>>                 guid: 15539718945154864246
>>                 path: '/dev/da11p1'
>>                 phys_path: '/dev/da11p1'
>>                 whole_disk: 1
>>                 DTL: 2318345
>>                 create_txg: 4
>>         children[2]:
>>             type: 'raidz'
>>             id: 2
>>             guid: 12675549095184402399
>>             nparity: 2
>>             metaslab_array: 30
>>             metaslab_shift: 37
>>             ashift: 9
>>             asize: 16003083796480
>>             is_log: 0
>>             create_txg: 4
>>             children[0]:
>>                 type: 'disk'
>>                 id: 0
>>                 guid: 9157198169193782917
>>                 path: '/dev/da16p1'
>>                 phys_path: '/dev/da16p1'
>>                 whole_disk: 1
>>                 DTL: 35652327
>>                 create_txg: 4
>>             children[1]:
>>                 type: 'disk'
>>                 id: 1
>>                 guid: 14494767130322696735
>>                 path: '/dev/da17p1'
>>                 phys_path: '/dev/da17p1'
>>                 whole_disk: 1
>>                 DTL: 2318365
>>                 create_txg: 4
>>             children[2]:
>>                 type: 'disk'
>>                 id: 2
>>                 guid: 17589735305722725561
>>                 path: '/dev/da18p1'
>>                 phys_path: '/dev/da18p1'
>>                 whole_disk: 1
>>                 DTL: 2318364
>>                 create_txg: 4
>>             children[3]:
>>                 type: 'disk'
>>                 id: 3
>>                 guid: 14307956723966398718
>>                 path: '/dev/da19p1'
>>                 phys_path: '/dev/da19p1'
>>                 whole_disk: 1
>>                 DTL: 2318363
>>                 create_txg: 4
>>             children[4]:
>>                 type: 'disk'
>>                 id: 4
>>                 guid: 15965579855708087029
>>                 path: '/dev/da20p1'
>>                 phys_path: '/dev/da20p1'
>>                 whole_disk: 1
>>                 DTL: 2318362
>>                 create_txg: 4
>>             children[5]:
>>                 type: 'disk'
>>                 id: 5
>>                 guid: 760237787742339099
>>                 path: '/dev/da21p1'
>>                 phys_path: '/dev/da21p1'
>>                 whole_disk: 1
>>                 DTL: 2318361
>>                 create_txg: 4
>>             children[6]:
>>                 type: 'disk'
>>                 id: 6
>>                 guid: 2500372020986714069
>>                 path: '/dev/da22p1'
>>                 phys_path: '/dev/da22p1'
>>                 whole_disk: 1
>>                 DTL: 2318360
>>                 create_txg: 4
>>             children[7]:
>>                 type: 'disk'
>>                 id: 7
>>                 guid: 4544065732274116075
>>                 path: '/dev/label/disk24'
>>                 phys_path: '/dev/label/disk24'
>>                 whole_disk: 1
>>                 DTL: 5572568
>>                 create_txg: 4
>>         children[3]:
>>             type: 'hole'
>>             id: 3
>>             guid: 0
>>             metaslab_array: 0
>>             metaslab_shift: 17
>>             ashift: 0
>>             asize: 0
>>             is_log: 0
>>             is_hole: 1
>>
>> On 07/31/13 14:56, Martin Birgmeier wrote:
>>> No, this is not an LSI HBA, but rather
>>>
>>> atapci0: port
>>> 0xcc00-0xcc07,0xc880-0xc883,0xc800-0xc807,0xc480-0xc483,0xc400-0xc40f
>>> mem 0xfe8fe000-0xfe8fffff irq 18 at device 0.0 on pci3
>>> atapci1: at channel -1 on atapci0
>>> atapci1: AHCI v1.00 controller with 2 3Gbps ports, PM supported
>>> ata2: on atapci1
>>> ata3: on atapci1
>>> ata4: at channel 0 on atapci0
>>> atapci2: port
>>> 0xa000-0xa007,0x9000-0x9003,0x8000-0x8007,0x7000-0x7003,0x6000-0x600f
>>> mem 0xfe4ffc00-0xfe4fffff irq 19 at device 17.0 on pci0
>>> atapci2: AHCI v1.20 controller with 6 3Gbps ports, PM supported
>>> ata5: at channel 0 on atapci2
>>> ata6: at channel 1 on atapci2
>>> ata7: at channel 2 on atapci2
>>> ata8: at channel 3 on atapci2
>>> ata9: at channel 4 on atapci2
>>> ata10: at channel 5 on atapci2
>>> ada0 at ata5 bus 0 scbus3 target 0 lun 0
>>> ada1 at ata6 bus 0 scbus4 target 0 lun 0
>>> ada2 at ata7 bus 0 scbus5 target 0 lun 0
>>> ada3 at ata8 bus 0 scbus6 target 0 lun 0
>>> ada4 at ata9 bus 0 scbus7 target 0 lun 0
>>> ada5 at ata10 bus 0 scbus8 target 0 lun 0
>>>
>>> (This is just a simple PC motherboard - ASUS M4A89GTD PRO USB3.)
>>>
>>> Regards,
>>>
>>> Martin
>>>
>>
>>