From: Artem Belevich
To: Dennis Glatting
Cc: freebsd-fs@freebsd.org
Date: Tue, 11 Oct 2011 16:59:54 -0700
Subject: Re: ZFS/compression/performance

On Tue, Oct 11, 2011 at 4:25 PM, Dennis Glatting wrote:
> I would appreciate someone knowledgeable in ZFS point me in the right
> direction.
>
> I have several ZFS arrays, some using gzip for compression. The compressed
> arrays hold very large text documents (10MB->20TB) and are highly
> compressible. Reading the files from a compressed data set is fast with
> little load. However, writing to the compressed data sets incurs substantial
> load, on the order of a load average from 12 to 20.
>
> My questions are:
>
> 1) Why such a heavy load on writing?

gzip compression is relatively slow, even at gzip-1, even on a fast CPU,
even with multiple cores. ZFS compresses per block and spreads the work
across multiple worker threads. Compression seems to happen when the data
is being flushed to disk.

What typically happens is that the data you write accumulates in ARC.
Every 10 seconds (the ZFSv28 default, I believe; it used to be 30) ZFS
starts flushing whatever has accumulated, so all of that data hits the
compression threads at once.

Setting compression to its lowest level (gzip-1) will help a bit. Getting
the fastest CPU(s) you can afford will help too, because you will be hard
pressed to compress data fast enough to saturate the bandwidth of a single
HDD, never mind a multi-disk pool.

Another option is to switch to lzjb compression. The compression ratio
will be limited to ~2x, but it's pretty fast.

> 2) What kind of limiters can I put into effect to reduce load
>    without impacting compressibility? For example, is there some
>    variable to control the number of parallel compression
>    operations?

If, on average, you write data faster than your CPU can compress it with
the chosen compression setting, there's not much you can do. If the CPU can
keep up with writes in general, then there are a few things you can do to
prevent the compression rush. Tinkering with the following tunables may
help:

vfs.zfs.txg.timeout -- how frequently ZFS flushes data
vfs.zfs.txg.write_limit_override -- limits how fast ZFS tries to write data

> I have a number of different systems.
> Memory is 24GB on each of the two large data systems, SSD (Revo) for
> cache, and a SATA II ZIL. One system is a 6 core i7 @ 3.33 GHz and the
> other a 4 core i7 @ 2.93 GHz. The arrays are RAIDz using cheap 2TB disks.

For gzip-9, even a 6-core i7 may still be a bottleneck. On a similar system
I have, gzip only gives me about 12MB/s per core. Six cores at that rate
come to roughly 72MB/s in total, which would be barely enough to keep one
disk busy.

--Artem
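
A rough sketch for checking the per-core figure on other hardware. It
assumes Python's zlib is a reasonable stand-in for the in-kernel gzip code,
a 128 KiB record size, and a 100 MB/s disk. None of those come from the
thread, and the generated sample text is hypothetical, so substitute real
data before trusting the numbers.

```python
import random
import time
import zlib

BLOCK_SIZE = 128 * 1024   # assumption: ZFS record size of 128 KiB
DISK_MB_S = 100.0         # assumption: sequential write speed of one HDD


def sample_text(size, seed=0):
    """Generate pseudo-text as a stand-in for real documents (hypothetical data)."""
    rng = random.Random(seed)
    words = [b"zfs", b"pool", b"write", b"block", b"gzip", b"flush", b"disk", b"data"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])


def per_core_mb_s(data, level, block_size=BLOCK_SIZE):
    """Compress data block by block on one thread; return (MB/s, compression ratio)."""
    start = time.perf_counter()
    compressed = 0
    for off in range(0, len(data), block_size):
        compressed += len(zlib.compress(data[off:off + block_size], level))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return (len(data) / 1e6) / elapsed, len(data) / compressed


def cores_to_saturate(mb_s_per_core, disk_mb_s=DISK_MB_S):
    """Number of cores needed at this per-core rate to feed one disk."""
    return disk_mb_s / mb_s_per_core


if __name__ == "__main__":
    data = sample_text(8 * BLOCK_SIZE)
    for level in (1, 6, 9):
        rate, ratio = per_core_mb_s(data, level)
        print(f"gzip-{level}: {rate:.1f} MB/s per core, {ratio:.1f}x, "
              f"{cores_to_saturate(rate):.1f} cores per disk")
```

With the 12MB/s per core quoted above and a disk that takes 72MB/s,
cores_to_saturate(12.0, 72.0) gives 6, which matches the "barely enough"
estimate for the 6-core box.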