Date: Mon, 30 Nov 2009 18:31:49 -0600
From: "James R. Van Artsdalen" <james-freebsd-fs2@jrv.org>
To: Andrew Snow <andrew@modulus.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS guidelines - preparing for future storage expansion
Message-ID: <4B1463F5.5020403@jrv.org>
In-Reply-To: <4B14495E.7050306@modulus.org>
References: <2ae8edf30911300120x627e42a9ha2cf003e847d4fbd@mail.gmail.com>
    <4B139AEB.8060900@jrv.org>
    <2ae8edf30911300425g4026909bm9262f6abcf82ddcd@mail.gmail.com>
    <5f67a8c40911301233s46a2818at9051c4ebbacf7e25@mail.gmail.com>
    <4B14495E.7050306@modulus.org>
Andrew Snow wrote:
> Currently there is no "read-ahead" for scrubbing and resilvering, so
> it only talks to one disk and at a time and proceeds using only about
> half the I/O capacity of your disks (or less). Read-ahead is one of
> the planned features for ZFS next year.

gstat sometimes shows multiple outstanding I/O requests to a drive
during a scrub. All of the disks are lit up at the same time: there's
no one-disk-at-a-time.

I see roughly 500 MB/sec during a scrub, which is around 50% of the
theoretical bandwidth of both the disk-to-HBA links and the
HBA-to-system slot in my case. I hope to be able to fix both this
spring and see if I can reach gigabyte-per-second levels, especially
for userland reads (I've seen 420 MB/s so far).

(each vdev in my case is a 2-way mirror so 500 MB/s of disk is 250 MB/s
of user data)

> Also, when your disks are 98% or more full and you are doing any
> writes at all ZFS spends a long time looking for free blocks with an
> inefficient algorithm. An improved "disk full" algorithm is also
> planned for next year.

As the disk approaches 100% capacity the free space list(s) become
shorter, not longer. It's fragmentation, or the need to search a long
time for a large block in the right area, that is likely the problem.
If you can accept the block at the head of the list there is no search
at all.
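To make the last point concrete, here is a toy sketch of the difference between accepting the head of a free list and searching for a sufficiently large block. It is not ZFS's allocator; the list layout, the function names `take_head` and `first_fit`, and the extent sizes are all hypothetical and chosen only for illustration.

```python
# Toy model only: a free list as a Python list of (start, size) extents.
# This does NOT reflect how ZFS tracks free space; it only illustrates
# why search cost depends on what you are willing to accept.

def take_head(free_list):
    """Accept whatever extent is at the head: one entry examined."""
    return free_list[0], 1  # (extent, entries examined)


def first_fit(free_list, min_size):
    """Scan from the head until an extent of at least min_size appears."""
    for examined, extent in enumerate(free_list, start=1):
        _start, size = extent
        if size >= min_size:
            return extent, examined
    return None, len(free_list)  # nothing large enough: full scan wasted


# Hypothetical fragmented pool: 1000 one-block holes, then one 128-block hole.
free_list = [(i * 16, 1) for i in range(1000)] + [(16000, 128)]

head_extent, head_cost = take_head(free_list)
big_extent, big_cost = first_fit(free_list, min_size=64)

print("head of list:", head_extent, "examined", head_cost)
print("first fit >= 64:", big_extent, "examined", big_cost)
```

In this toy case the head is available after examining one entry, while finding a 64-block extent means walking past every small fragment first. The cost comes from fragmentation and the size requirement, not from the list being long because the pool is full.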
A quick snapshot during a scrub:

dT: 1.006s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    238    238  30525    4.4      0      0    0.0   34.8| ada2
    0    235    235  30016    3.1      0      0    0.0   26.3| ada3
    0    274    274  35104    3.7      0      0    0.0   35.4| ada4
    0    277    277  35485    4.0      0      0    0.0   40.5| ada5
    0    273    273  34976    2.9      0      0    0.0   29.4| ada6
    4    270    270  34474    7.2      0      0    0.0   53.4| ada7
    0    271    271  34722    3.2      0      0    0.0   32.4| ada8
    5    270    270  34410    3.4      0      0    0.0   34.0| ada9
    7    268    268  34277    5.8      0      0    0.0   43.6| ada10
    0    267    267  34213    4.3      0      0    0.0   32.1| ada11
    4    269    269  34468    5.7      0      0    0.0   41.6| ada12
    7    268    268  34277    4.7      0      0    0.0   33.4| ada13
    0    277    277  35421    5.1      0      0    0.0   36.5| ada14
    4    270    270  34595    5.4      0      0    0.0   37.5| ada15
    0    269    269  34468    6.3      0      0    0.0   43.9| ada16
    0    275    275  35167    6.2      0      0    0.0   44.9| ada17
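As a rough cross-check of the "roughly 500 MB/sec" figure, the read kBps column of the snapshot above can be summed and halved for the 2-way mirrors. This is a minimal sketch: the per-disk numbers are copied from the snapshot, the helper name `summarize` is made up, and treating 1 kBps as 1000 bytes per second is an assumption (using 1024 would shift the result by about 2%).

```python
# Read throughput per disk (kBps column), copied from the gstat snapshot.
READ_KBPS = {
    "ada2": 30525, "ada3": 30016, "ada4": 35104, "ada5": 35485,
    "ada6": 34976, "ada7": 34474, "ada8": 34722, "ada9": 34410,
    "ada10": 34277, "ada11": 34213, "ada12": 34468, "ada13": 34277,
    "ada14": 35421, "ada15": 34595, "ada16": 34468, "ada17": 35167,
}

MIRROR_WAYS = 2  # each vdev is a 2-way mirror, as stated in the message


def summarize(read_kbps, mirror_ways):
    """Return (total kBps, total MB/s, user-data MB/s) for one snapshot."""
    total_kbps = sum(read_kbps.values())
    total_mb_s = total_kbps / 1000        # assumption: 1 kB = 1000 bytes
    user_mb_s = total_mb_s / mirror_ways  # every block is read from both sides
    return total_kbps, total_mb_s, user_mb_s


total_kbps, total_mb_s, user_mb_s = summarize(READ_KBPS, MIRROR_WAYS)
print(f"disks: {len(READ_KBPS)}")
print(f"aggregate disk reads: {total_kbps} kBps (~{total_mb_s:.0f} MB/s)")
print(f"user data scrubbed:   ~{user_mb_s:.0f} MB/s")
```

For this one-second sample the 16 disks sum to 546598 kBps, about 547 MB/s of disk reads or about 273 MB/s of user data, which is in line with the rounded 500 and 250 MB/s figures quoted earlier.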