Date:      Tue, 11 Oct 2011 16:59:54 -0700
From:      Artem Belevich <art@freebsd.org>
To:        Dennis Glatting <freebsd@penx.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS/compression/performance
Message-ID:  <CAFqOu6juMVicHUOtQDBQxzjpAcvuRFTfV%2BtiOx5BSnOoshOUZA@mail.gmail.com>
In-Reply-To: <alpine.BSF.2.00.1110111710210.12895@Elmer.dco.penx.com>
References:  <alpine.BSF.2.00.1110111710210.12895@Elmer.dco.penx.com>

On Tue, Oct 11, 2011 at 4:25 PM, Dennis Glatting <freebsd@penx.com> wrote:
> I would appreciate someone knowledgeable in ZFS point me in the right
> direction.
>
> I have several ZFS arrays, some using gzip for compression. The compressed
> arrays hold very large text documents (10MB->20TB) and are highly
> compressible. Reading the files from the compressed data sets is fast with
> little load. However, writing to the compressed data sets incurs substantial
> load, on the order of a load average from 12 to 20.
>
> My questions are:
>
> 1) Why such a heavy load on writing?

gzip compression is relatively slow, even gzip-1, even on a fast CPU
with multiple cores. ZFS does per-block compression and spreads the
load across multiple worker threads. Compression seems to happen when
the data is being flushed to disk.

What typically happens is that the data you write gets accumulated in
ARC. Every 10 seconds (the ZFSv28 default, I believe; it used to be 30
before) ZFS starts flushing whatever has been accumulated. Setting
compression to its lowest setting (gzip-1) will help a bit. Getting
the fastest CPU(s) you can afford will help, too, because you will be
hard pressed to compress data fast enough to saturate even a single
HDD's bandwidth, never mind a multi-disk pool. Another option is to
switch to lzjb compression. The compression ratio will be limited to
~2x, but it's pretty fast.
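For example, both changes can be made with `zfs set` (the dataset name
"tank/data" below is just a placeholder; substitute your own):

```shell
# "tank/data" is a hypothetical dataset name; substitute your own.
# Drop gzip to its cheapest level:
zfs set compression=gzip-1 tank/data

# ...or switch to the much faster (but ~2x-ratio) lzjb:
zfs set compression=lzjb tank/data

# Verify the setting:
zfs get compression tank/data
```

Note that the new setting only affects blocks written from now on;
existing blocks keep whatever compression they were written with.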


> 2) What kind of limiters can I put into effect to reduce load
>    without impacting compressibility? For example, is there some
>    variable that controls the number of parallel compression
>    operations?

If on average you write data faster than your CPU can compress it
with the chosen compression settings, there's not much you can do.

If the CPU can keep up with writes in general, then there are a few
things you can do to prevent a compression rush.

Tinkering with the following tunables may help:
vfs.zfs.txg.timeout -- how frequently ZFS flushes data
vfs.zfs.txg.write_limit_override -- limits how much data ZFS tries to write per flush
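For example (the values below are only illustrative; tune them for
your own workload):

```shell
# Flush transaction groups every 30 seconds instead of the default 10,
# giving the compression threads larger, less frequent batches of
# dirty data (illustrative value):
sysctl vfs.zfs.txg.timeout=30

# Cap the amount of data written per transaction group -- here 1 GiB
# (illustrative value) -- so that each flush stays short:
sysctl vfs.zfs.txg.write_limit_override=1073741824

# To make the settings persistent across reboots, add to /boot/loader.conf:
#   vfs.zfs.txg.timeout="30"
#   vfs.zfs.txg.write_limit_override="1073741824"
```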

> I have a number of different systems. Memory is 24GB on each of the two
> large data systems, SSD (Revo) for cache, and a SATA II ZIL. One system is a
> 6 core i7 @ 3.33 GHz and the other a 4 core i7 @ 2.93 GHz. The arrays are
> RAIDz using cheap 2TB disks.

For gzip-9 even a 6-core i7 may still be a bottleneck.
On a similar system I have, gzip only gives me about 12MB/s per core.
Six cores at ~12MB/s each is roughly 72MB/s, barely enough to keep
one disk busy.

--Artem
