From owner-freebsd-current@FreeBSD.ORG Mon Dec 19 20:54:14 2011
Message-ID: <4EEFA472.5020509@freebsd.org>
Date: Mon, 19 Dec 2011 21:54:10 +0100
From: Stefan Esser <se@freebsd.org>
To: Garrett Cooper
Cc: FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: Uneven load on drives in ZFS RAIDZ1

On 19.12.2011 18:05, Garrett Cooper wrote:
> On Mon, Dec 19, 2011 at 6:22 AM, Stefan Esser wrote:
>> Hi ZFS users,
>>
>> for quite some time I have observed an uneven distribution of load
>> between drives in a 4 * 2TB RAIDZ1 pool.
>> The following is an excerpt of a longer log of 10 second averages
>> logged with gstat:
>>
>> dT: 10.001s  w: 10.000s  filter: ^a?da?.$
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>     0    130    106   4134    4.5     23   1033    5.2   48.8| ada0
>>     0    131    111   3784    4.2     19   1007    4.0   47.6| ada1
>>     0     90     66   2219    4.5     24   1031    5.1   31.7| ada2
>>     1     81     58   2007    4.6     22   1023    2.3   28.1| ada3
>>
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>     1    132    104   4036    4.2     27   1129    5.3   45.2| ada0
>>     0    129    103   3679    4.5     26   1115    6.8   47.6| ada1
>>     1     91     61   2133    4.6     30   1129    1.9   29.6| ada2
>>     0     81     56   1985    4.8     24   1102    6.0   29.4| ada3
>>
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>     1    148    108   4084    5.3     39   2511    7.2   55.5| ada0
>>     1    141    104   3693    5.1     36   2505   10.4   54.4| ada1
>>     1    102     62   2112    5.6     39   2508    5.5   35.4| ada2
>>     0     99     60   2064    6.0     39   2483    3.7   36.1| ada3
>
> This suggests (note that I said suggests) that there might be a slight
> difference in the data path speeds or physical media, as someone else
> suggested; look at zpool iostat -v though before making a firm
> statement as to whether or not a drive is truly not performing to your
> assumed spec. gstat and zpool iostat -v only suggest performance --
> they aren't the end-all-be-all for determining drive performance.

I doubt there is a difference in the data path speeds, since all drives
are connected to the SATA II ports of an Intel H67 chip. The drives seem
to perform equally well, just with a ratio of read requests of
30% / 30% / 20% / 20% for ada0 .. ada3. But neither queue length nor
command latencies indicate a problem or differences between the drives.

It seems that a different number of commands is scheduled for two of the
four drives compared to the other two, and that scheduling should be
part of the ZFS code. I'm quite convinced that neither the drives nor
the other hardware plays a role, but I'll follow the suggestion to swap
drives between controller ports and observe whether the increased read
load moves with the drives (indicating that something on disk causes the
anomaly) or stays with the SATA ports (indicating that lower numbered
ports see higher load).

> If the latency numbers were high enough, I would suggest dd'ing out to
> the individual drives (i.e. remove the drive from the RAIDZ) to see if
> there's a noticeable discrepancy, as this can indicate a bad cable,
> backplane, or drive; from there I would start doing the physical swap
> routine and see if the issue moves with the drive or stays static with
> the controller channel and/or chassis slot.

I do not expect a hardware problem, since command latencies are very
similar across all drives, despite the higher read load on some of them.
Those drives are busier by exactly the factor to be expected from the
higher command rate alone.

But it seems that others do not observe this asymmetric distribution of
requests, which makes me wonder whether I happen to have metadata
arranged in such a way that it is always read from ada0 or ada1, but not
(or only rarely) from ada2 or ada3. That could explain it, including the
fact that RAIDZ1 over other numbers of drives (e.g. 3 or 6) apparently
shows a much more symmetric distribution of read requests.

Regards,
Stefan
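
For reference, a minimal sketch of how the two views discussed above can
be watched side by side. The device regex matches the ada0..ada3 members
from the log; the pool name "tank" is only a placeholder and has to be
replaced by the real pool name.

    # In one terminal: GEOM level statistics for just the pool member
    # drives, averaged over 10 second intervals (as in the log above).
    gstat -I 10s -f '^ada[0-3]$'

    # In a second terminal: per-vdev statistics as seen by ZFS itself,
    # also on a 10 second interval ("tank" is a placeholder pool name).
    zpool iostat -v tank 10

Comparing the two shows whether the 30/30/20/20 read split originates in
the requests ZFS issues per vdev member or somewhere below it in GEOM.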
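
And a read-only variant of the dd comparison suggested above (Garrett's
version takes a drive out of the RAIDZ and writes to it; plain reads
from the raw devices are safe while the pool stays imported, they only
compete for bandwidth). Roughly equal transfer rates from all four
drives would point away from a cable, port or drive problem. The 4 GB
read size per drive is an arbitrary choice.

    # Sequential read test of each raw device; dd prints the achieved
    # bytes/sec at the end of each run. bs=1m count=4096 reads 4 GB.
    for d in ada0 ada1 ada2 ada3; do
        echo "=== /dev/$d ==="
        dd if=/dev/$d of=/dev/null bs=1m count=4096
    done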