From: Bob Friesenhahn
Date: Wed, 24 Mar 2010 12:23:49 -0500 (CDT)
To: Dan Naumov
Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org
Subject: Re: tuning vfs.zfs.vdev.max_pending and solving the issue of ZFS writes choking read IO

On Wed, 24 Mar 2010, Dan Naumov wrote:

> Has anyone done any extensive testing of the effects of tuning
> vfs.zfs.vdev.max_pending on this issue? Is there some universally
> recommended value beyond the default 35? Anything else I should be
> looking at?

The vdev.max_pending value is primarily used to tune for SAN/HW-RAID
LUNs and is used to dial down LUN service time (svc_t) values by
limiting the number of pending requests.
It is not terribly useful for decreasing stalls due to zfs writes.

In order to reduce the impact of zfs writes, you want to limit the
maximum size of a zfs transaction group (TXG). I don't know what the
FreeBSD tunable is for this, but under Solaris it is
zfs:zfs_write_limit_override.

On a large-memory system, a properly working zfs should not saturate
the write channel for more than 5 seconds. Zfs tries to learn the
write bandwidth so that it can tune the TXG size up to 5 seconds (max)
worth of writes. If you have both large memory and fast storage,
quite a huge amount of data can be written in 5 seconds. On my
Solaris system, I found that zfs was quite accurate with its rate
estimation, but it resulted in four gigabytes of data being written
per TXG.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
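
As a back-of-envelope illustration of the sizing described above (TXG size is roughly the learned write bandwidth times at most 5 seconds), here is a minimal Python sketch. It is arithmetic only and does not model how zfs actually estimates bandwidth. The function names are hypothetical, "four gigabytes" is taken as 4 GiB, and whether a byte count like this is the right unit for a limit such as zfs:zfs_write_limit_override is an assumption that should be checked against the platform documentation.

```python
# Rough TXG sizing arithmetic. Illustrative only; not how zfs computes it.

GIB = 1024 ** 3
TXG_WINDOW_SECONDS = 5  # max seconds' worth of writes per TXG, per the message


def txg_bytes(write_bandwidth_bps, window_s=TXG_WINDOW_SECONDS):
    """Approximate TXG size for a given sustained write bandwidth (bytes/s)."""
    return write_bandwidth_bps * window_s


def implied_bandwidth(txg_size_bytes, window_s=TXG_WINDOW_SECONDS):
    """Write bandwidth (bytes/s) implied by an observed TXG size."""
    return txg_size_bytes / window_s


def cap_for_burst(write_bandwidth_bps, max_burst_s):
    """Hypothetical helper: TXG byte cap so one flush lasts about max_burst_s."""
    return int(write_bandwidth_bps * max_burst_s)


if __name__ == "__main__":
    # 4 GiB per TXG over a 5 second window implies about 0.8 GiB/s of writes.
    bw = implied_bandwidth(4 * GIB)
    print(f"implied write bandwidth: {bw / GIB:.2f} GiB/s")

    # Cap that would keep a single write burst to roughly one second.
    print(f"cap for a 1 s burst: {cap_for_burst(bw, 1)} bytes")
```

The point of the inverse calculation is that a 4 GiB TXG only fits in the 5 second window on storage that sustains roughly 0.8 GiB/s; a smaller cap trades shorter write bursts for more frequent ones.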