From owner-freebsd-stable@FreeBSD.ORG Fri Oct 8 09:39:00 2010
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6E26A106566B
	for ; Fri, 8 Oct 2010 09:39:00 +0000 (UTC)
	(envelope-from jhay@meraka.csir.co.za)
Received: from zibbi.meraka.csir.co.za (unknown [IPv6:2001:4200:7000:2::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 92FB68FC13
	for ; Fri, 8 Oct 2010 09:38:59 +0000 (UTC)
Received: by zibbi.meraka.csir.co.za (Postfix, from userid 3973)
	id 6718C39822; Fri, 8 Oct 2010 11:38:57 +0200 (SAST)
Date: Fri, 8 Oct 2010 11:38:57 +0200
From: John Hay
To: Andriy Gapon
Message-ID: <20101008093857.GA78363@zibbi.meraka.csir.co.za>
References: <20101007121558.GA70199@zibbi.meraka.csir.co.za>
	<20101007155042.GA88362@zibbi.meraka.csir.co.za>
	<20101007173102.GA95405@zibbi.meraka.csir.co.za>
	<4CAE1146.3030305@icyb.net.ua>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4CAE1146.3030305@icyb.net.ua>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-stable@freebsd.org
Subject: Re: zfs hang in zio->io_cv) with dd read
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Fri, 08 Oct 2010 09:39:00 -0000

On Thu, Oct 07, 2010 at 09:28:22PM +0300, Andriy Gapon wrote:
> on 07/10/2010 20:31 John Hay said the following:
> > Oct 7 17:11:49 thumper1 kernel: mvsch23: EMPTY CRPB 30 (->0) 0 4000
>
> Can you rule out hardware (or driver-level) problems?
> E.g. by dd-ing to/from disk directly.
> Doing that in parallel on the same and/or different disks.
> Running any disk I/O benchmarks.

Well, it might not be conclusive, but here is what I have done/tried:

dd from a few select disks. They all do about 64MB/s and 900 interrupts
per second.
No kernel messages in dmesg or /var/log/messages. Typical command is:

dd if=/dev/ada17 of=/dev/null bs=64k count=80000

8 simultaneous dds from the 8 disks on a controller. I still get 64MB/s
and 7000+ interrupts per second. No kernel messages.

6 simultaneous dds from a disk on each of the 6 controllers. I still get
64MB/s and 900+ interrupts per second per controller. No kernel messages.

I made a small zfs raidz2 with 6 disks, one from each controller. dd to
and from it with no problem.

I made a small zfs raidz2 with 8 disks, all from one controller. dd to
and from it at 190MB/s and 270MB/s, no problem. Bonnie++ finished without
a problem.

Next I made a zpool with 2 X raidz2 with 8 disks each. Each raidz2 on its
own controller:

zpool create -m none tst \
	raidz2 ada0p1 ada1p1 ada2p1 ada3p1 ada4p1 ada5p1 ada6p1 ada7p1 \
	raidz2 ada8p1 ada9p1 ada10p1 ada11p1 ada12p1 ada13p1 ada14p1 ada15p1

Creating a file with dd finished without a problem, about 245MB/s.

# dd if=/dev/zero of=/export/tst.dd bs=64k count=160000
160000+0 records in
160000+0 records out
10485760000 bytes transferred in 42.732294 secs (245382567 bytes/sec)

Reading from the file caused a hang again:

# dd of=/dev/null if=/export/tst.dd bs=64k

This message arrived in dmesg:

mvsch15: EMPTY CRPB 13 (->14) 0 0000

And a little later there was a lot more:

mvsch15: Timeout on slot 1
mvsch15: iec 02000000 sstat 00000123 serr 00000000 edma_s 00001100 dma_c 00000000 dma_s 00000000 rs 00000002 status 50
mvsch2: EMPTY CRPB 16 (->0) 2 4000
mvsch2: EMPTY CRPB 18 (->0) 1 4000
mvsch2: EMPTY CRPB 19 (->0) 2 4000
mvsch2: EMPTY CRPB 20 (->0) 3 4000
mvsch2: EMPTY CRPB 21 (->0) 0 4000
mvsch2: EMPTY CRPB 22 (->0) 1 4000
mvsch2: EMPTY CRPB 23 (->0) 2 4000
...

While this was happening, a dd from ada7p1 ran at normal speed, but one
from ada15p1 (which is on mvsch15) hung for a while until there was a
burst of mvsX interrupts, and then finished without a further hiccup.
The original dd from tst.dd still has not finished.
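
To see at a glance which channels are affected when the log gets long, the messages can be tallied per channel. The following is only a rough sketch in Python: the function name tally_mvs_events is made up for illustration, and the regular expression assumes the two message shapes quoted above ("EMPTY CRPB" and "Timeout on slot"), not the full set of messages the driver can print.

```python
import re
from collections import Counter

# Assumption: only the two message shapes quoted in this mail are matched.
# The pattern captures the channel name (e.g. "mvsch15") and the event kind.
LINE_RE = re.compile(r"\b(mvsch\d+): (EMPTY CRPB|Timeout on slot)")


def tally_mvs_events(lines):
    """Count EMPTY CRPB and timeout messages per mvsch channel."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the dmesg excerpts above.
    sample = [
        "mvsch15: EMPTY CRPB 13 (->14) 0 0000",
        "mvsch15: Timeout on slot 1",
        "mvsch2: EMPTY CRPB 16 (->0) 2 4000",
        "mvsch2: EMPTY CRPB 18 (->0) 1 4000",
    ]
    for (channel, kind), n in sorted(tally_mvs_events(sample).items()):
        print(f"{channel:8s} {kind:16s} {n}")
```

Feeding it the lines of /var/log/messages instead of the sample list should work the same way, since the pattern also matches when a syslog timestamp and host name precede the channel name.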

So it might be a driver problem which only occurs when the driver is
pushed in a different way than I could with my simultaneous dds to the
raw partitions.

If there are more tests that I can do, just say what. If someone wants a
login to debug this, I can arrange it.

John
-- 
John Hay -- jhay@meraka.csir.co.za / jhay@FreeBSD.org