From: Garrett Cooper <yanegomi@gmail.com>
Date: Mon, 19 Dec 2011 13:00:18 -0800
To: Stefan Esser
Cc: FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: Uneven load on drives in ZFS RAIDZ1

On Dec 19, 2011, at 12:54 PM, Stefan Esser wrote:

> On 19.12.2011 18:05, Garrett Cooper wrote:
>> On Mon, Dec 19, 2011 at 6:22 AM, Stefan Esser wrote:
>>> Hi ZFS users,
>>>
>>> for quite some time I have observed an uneven distribution of load
>>> between drives in a 4 * 2TB RAIDZ1 pool.
>>> The following is an excerpt of a longer log of 10 second averages
>>> logged with gstat:
>>>
>>> dT: 10.001s  w: 10.000s  filter: ^a?da?.$
>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>>     0    130    106   4134    4.5     23   1033    5.2   48.8| ada0
>>>     0    131    111   3784    4.2     19   1007    4.0   47.6| ada1
>>>     0     90     66   2219    4.5     24   1031    5.1   31.7| ada2
>>>     1     81     58   2007    4.6     22   1023    2.3   28.1| ada3
>>>
>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>>     1    132    104   4036    4.2     27   1129    5.3   45.2| ada0
>>>     0    129    103   3679    4.5     26   1115    6.8   47.6| ada1
>>>     1     91     61   2133    4.6     30   1129    1.9   29.6| ada2
>>>     0     81     56   1985    4.8     24   1102    6.0   29.4| ada3
>>>
>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>>     1    148    108   4084    5.3     39   2511    7.2   55.5| ada0
>>>     1    141    104   3693    5.1     36   2505   10.4   54.4| ada1
>>>     1    102     62   2112    5.6     39   2508    5.5   35.4| ada2
>>>     0     99     60   2064    6.0     39   2483    3.7   36.1| ada3
>>
>> This suggests (note that I said suggests) that there might be a slight
>> difference in the data path speeds or physical media, as someone else
>> suggested; look at zpool iostat -v, though, before making a firm
>> statement as to whether or not a drive is truly not performing to your
>> assumed spec. gstat and zpool iostat -v only suggest performance --
>> they aren't the end-all-be-all for determining drive performance.
>
> I doubt there is a difference in the data path speeds, since all drives
> are connected to the SATA II ports of an Intel H67 chip.
>
> The drives seem to perform equally well, just with a ratio of read
> requests of 30% / 30% / 20% / 20% for ada0 .. ada3. But neither queue
> length nor command latencies indicate a problem or differences between
> the drives. It seems that a different number of commands is scheduled
> for 2 of the 4 drives, compared to the other 2, and that scheduling
> should be part of the ZFS code. I'm quite convinced that neither the
> drives nor the other hardware plays a role, but I'll follow the
> suggestion to swap drives between controller ports and observe whether
> the increased read load moves with the drives (indicating that
> something on disk causes the anomaly) or stays with the SATA ports
> (indicating that lower numbered ports see higher load).
>
>> If the latency numbers were high enough, I would suggest dd'ing out to
>> the individual drives (i.e. remove the drive from the RAIDZ) to see if
>> there's a noticeable discrepancy, as this can indicate a bad cable,
>> backplane, or drive; from there I would start doing the physical swap
>> routine and see if the issue moves with the drive or stays static with
>> the controller channel and/or chassis slot.
>
> I do not expect a hardware problem, since command latencies are very
> similar across all drives, despite the higher read load on some of
> them. Those drives are busier by exactly the factor to be expected
> from the higher command rate alone.
>
> But it seems that others do not observe this asymmetric distribution of
> requests, which makes me wonder whether I happen to have metadata
> arranged in such a way that it is always read from ada0 or ada1, but
> not (or rarely) from ada2 or ada3. That could explain it, including the
> fact that raidz1 over other numbers of drives (e.g. 3 or 6) apparently
> shows a much more symmetric distribution of read requests.

Basic question: does one set of drives vibrate differently than the
other set?

-Garrett
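
P.S. To put concrete commands behind the suggestions above: zpool
iostat can report per-device averages on the same 10-second cadence as
your gstat log. Something like the following (the pool name "tank" is
a placeholder -- substitute your own):

    # per-vdev and per-disk I/O statistics, repeated every 10 seconds
    zpool iostat -v tank 10

If ZFS's own per-disk read counts show the same 30/30/20/20 split that
gstat does, the skew is being generated inside ZFS rather than by
anything below it.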
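
If you want numbers without pulling a drive out of the RAIDZ, a
read-only dd pass is a decent first cut (untested here, but these are
stock FreeBSD dd flags; it only reads, though you should quiesce the
pool so the numbers mean something):

    # read ~4 GB sequentially from one disk and discard it
    dd if=/dev/ada0 of=/dev/null bs=1m count=4096

repeated for ada1 through ada3. A clearly lower MB/s figure on one
drive points at the cable, port, or drive rather than the scheduler.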
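
And if you keep collecting those long runs, gstat's batch mode avoids
the curses display (this assumes a gstat new enough to have -b):

    # 10-second averages for ada0..ada3, appended to a log file
    gstat -b -I 10s -f '^a?da?.$' >> gstat.log

which matches the interval and filter shown in your excerpt.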