Date: Tue, 16 Jul 2013 17:23:37 +0300
From: Daniel Kalchev <daniel@digsys.bg>
To: Ivailo Tanusheff <Ivailo.Tanusheff@skrill.com>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS vdev I/O questions
Message-ID: <51E55769.4030207@digsys.bg>
In-Reply-To: <9d3cf0be165d4351acc5e757de3868ec@DB3PR07MB059.eurprd07.prod.outlook.com>
References: <51E5316B.9070201@digsys.bg> <20130716115305.GA40918@mwi1.coffeenet.org> <51E54799.8070700@digsys.bg> <9d3cf0be165d4351acc5e757de3868ec@DB3PR07MB059.eurprd07.prod.outlook.com>

On 16.07.13 17:09, Ivailo Tanusheff wrote:
> Isn't this some kind of pool fragmentation? Because this is usually the case in such slow parts of the disk systems. I think your pool is getting full and it is heavily fragmented, that's why you have more data for each request on a different vdev.

The pool may be fragmented, but not because it is full. It is fragmented because I forgot to add a separate ZIL device when creating the pool, then proceeded to use dedup heavily and even some compression.

Now I am rewriting the pool's data, and hopefully its metadata, in userland, for lack of better technology, primarily by doing zfs send/receive of various datasets and then removing the originals. That helps me both balance the data across all vdevs and get rid of dedup and compression (those go to other pools with fewer deletes).

My guess is that this is more specifically metadata fragmentation. But fragmentation does not fully explain why the writes are so irregular -- writes should be grouped easily, especially metadata rewrites... and what is ZFS doing during the many seconds when it is neither reading nor writing?

Moral: always add a separate ZIL device to a ZFS pool, as this will save you from dealing with fragmentation later. Depending on the pool usage, even a normal drive could do. Writes to the ZIL are sequential.

> But this has nothing to do with the single, slow device :(
>

That drive is slow only when doing lots of small I/O. For bulk writes (which ZFS should be doing anyway with the kind of data this pool holds), it is actually faster than the Hitachis. It will get replaced soon.

Daniel
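
The send/receive rewrite described above might look roughly like the sketch below. It only prints the commands for one dataset so they can be reviewed before anything is run. The pool and dataset name (tank/data), the "_new" suffix, and the "@rewrite" snapshot name are hypothetical, not taken from the message. A plain `zfs send` (without `-p` or `-R`) does not carry dataset properties, so the copy picks up its dedup and compression settings from wherever it is received.

```sh
#!/bin/sh
# Sketch: print (do not run) the commands that rewrite one dataset
# via snapshot + send/receive, then replace the original with the copy.
# Hypothetical names: the "_new" suffix and the "@rewrite" snapshot.
rewrite_plan() {
    src="$1"                 # dataset to rewrite, e.g. tank/data
    dst="${src}_new"         # temporary name for the fresh copy
    snap="${src}@rewrite"    # snapshot used as the send source

    echo "zfs snapshot ${snap}"
    echo "zfs send ${snap} | zfs receive ${dst}"
    # Verify the copy before running the next two lines by hand.
    echo "zfs destroy -r ${src}"
    echo "zfs rename ${dst} ${src}"
}

rewrite_plan tank/data
```

Printing the plan instead of executing it is a deliberate choice here: destroying the original is irreversible, so each step is better run manually once the received copy has been checked.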