Date:      Mon, 19 Dec 2011 09:05:20 -0800
From:      Garrett Cooper <yanegomi@gmail.com>
To:        Stefan Esser <se@freebsd.org>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: Uneven load on drives in ZFS RAIDZ1
Message-ID:  <CAGH67wQ6krJ=CRFt=Fb3TAikqKfCKekvtVnpQxkTPJgFcbMKsA@mail.gmail.com>
In-Reply-To: <4EEF488E.1030904@freebsd.org>
References:  <4EEF488E.1030904@freebsd.org>

On Mon, Dec 19, 2011 at 6:22 AM, Stefan Esser <se@freebsd.org> wrote:
> Hi ZFS users,
>
> for quite some time I have observed an uneven distribution of load
> between drives in a 4 * 2TB RAIDZ1 pool. The following is an excerpt of
> a longer log of 10 second averages logged with gstat:
>
> dT: 10.001s  w: 10.000s  filter: ^a?da?.$
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     0    130    106   4134    4.5     23   1033    5.2   48.8| ada0
>     0    131    111   3784    4.2     19   1007    4.0   47.6| ada1
>     0     90     66   2219    4.5     24   1031    5.1   31.7| ada2
>     1     81     58   2007    4.6     22   1023    2.3   28.1| ada3
>
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     1    132    104   4036    4.2     27   1129    5.3   45.2| ada0
>     0    129    103   3679    4.5     26   1115    6.8   47.6| ada1
>     1     91     61   2133    4.6     30   1129    1.9   29.6| ada2
>     0     81     56   1985    4.8     24   1102    6.0   29.4| ada3
>
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     1    148    108   4084    5.3     39   2511    7.2   55.5| ada0
>     1    141    104   3693    5.1     36   2505   10.4   54.4| ada1
>     1    102     62   2112    5.6     39   2508    5.5   35.4| ada2
>     0     99     60   2064    6.0     39   2483    3.7   36.1| ada3

This suggests (note that I said suggests) that there might be a slight
difference in the data path speeds or physical media, as someone else
suggested. Look at zpool iostat -v <interval> before making a firm
statement about whether a drive is truly failing to perform to your
assumed spec, though. gstat and zpool iostat -v only hint at
performance; neither is the be-all and end-all for determining drive
performance.
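
For example, something like this (the pool name "tank" and the
10-second interval are placeholders, not from the original report):

    # per-vdev I/O statistics from ZFS, refreshed every 10 seconds
    zpool iostat -v tank 10

    # the matching GEOM-level view, using the same filter as above
    gstat -f '^a?da?.$' -I 10s

If the per-disk skew shows up in both views over the same window, the
uneven load is in the requests actually being issued, not an artifact
of one measurement tool.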

If the latency numbers were high enough, I would suggest running dd
against the individual drives (i.e., after removing each drive from
the RAIDZ) to see if there's a noticeable discrepancy, as that can
indicate a bad cable, backplane, or drive. From there I would start
the physical swap routine and see whether the issue moves with the
drive or stays with the controller channel and/or chassis slot.
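
As a rough raw-read check, something along these lines (ada2 is just
an example device; only read from a drive that has been offlined or
pulled from the pool, and never write to it):

    # sequentially read the first 4 GB of the raw device; dd reports
    # elapsed time and bytes/sec on completion
    dd if=/dev/ada2 of=/dev/null bs=1m count=4096

Run the same command against each drive in turn; one that is clearly
slower than its siblings on an identical read is the first candidate
for the cable/slot swap described above.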

Cheers,
-Garrett


