From owner-freebsd-stable@FreeBSD.ORG Sun Nov 22 16:51:27 2009
Date: Sun, 22 Nov 2009 11:51:25 -0500
From: Adam McDougall <mcdouga9@egr.msu.edu>
To: "Svein Skogen (listmail account)"
Cc: freebsd-stable@freebsd.org
Subject: Re: 7.2 dies in zfs
Message-ID: <20091122165125.GN1213@egr.msu.edu>
In-Reply-To: <4B08FD93.4070409@stillbilde.net>
References: <4B066B13.1070006@freebsd.org>
 <4b07ac59.A2Afaf4X0IZlrgGU%perryh@pluto.rain.com>
 <57200BF94E69E54880C9BB1AF714BBCBA5722E@w2003s01.double-l.local>
 <20091121193643.GA14122@icarus.home.lan>
 <20091122052030.GL1213@egr.msu.edu>
 <4B08FD93.4070409@stillbilde.net>
User-Agent: Mutt/1.5.20 (2009-06-14)

On Sun, Nov 22, 2009 at 10:00:03AM +0100, Svein Skogen (listmail account) wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Adam McDougall wrote:
> On Sat, Nov 21, 2009 at 11:36:43AM -0800, Jeremy Chadwick wrote:
>
> > On Sat, Nov 21, 2009 at 08:07:40PM +0100, Johan Hendriks wrote:
> > Randy Bush wrote:
> > > imiho, zfs can not be called production ready if it crashes if you
> > > do not stand on your left leg, put your right hand in the air, and
> > > burn some eye of newt.
> >
> > This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
> > been marked as production ready?
> > As far as I know, on FreeBSD 8.0 ZFS is called production ready.
> >
> > If you boot your system it probably tells you it is still experimental.
> >
> > Try running FreeBSD 7-STABLE to get the latest ZFS version, which on
> > FreeBSD is 13.
> > On 7.2 it is still at 6 (if I remember it right).
>
> RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.
>
> RELENG_7 and RELENG_8 both, more or less, behave the same way with
> regards to ZFS. Both panic on kmem exhaustion. No one has answered my
> question as far as what's needed to stabilise ZFS on either 7.x or 8.x.
>
> I have a stable public ftp/http/rsync/cvsupd mirror that runs ZFS v13.
> It has been stable since mid-May. I have not had a kmem panic on any
> of my ZFS systems for a long time; it's a matter of making sure there
> is enough kmem at boot (not depending on kmem_size_max) and that it is
> big enough that fragmentation does not cause a premature allocation
> failure due to the lack of a large-enough contiguous chunk. This
> requires the platform to support a kmem size that is "big enough"...
> i386 can barely muster 1.6G, and sometimes that might not be enough.
> I'm pretty sure all of my currently existing ZFS systems are amd64,
> where the kmem can now be huge. On the busy fileserver with 20 gigs of
> RAM running FreeBSD 8.0-RC2 #21: Tue Oct 27 21:45:41 EDT 2009, I
> currently have:
>
> vfs.zfs.arc_max=16384M
> vfs.zfs.arc_min=4096M
> vm.kmem_size=18G
>
> The ARC settings here are to try to encourage it to favor the ARC
> cache instead of whatever else Inactive memory in 'top' contains.

Very interesting. For my iSCSI backend (running istgt from ports), I had
to lower arc_max below 128M to stop the iSCSI initiators from generating
timeouts when the cache flushed. (This is on a system with a MegaRAID
8308ELP handling the disk back end, with the disks in two RAID5 arrays
of four disks each, zpooled as one big pool.) With arc_max above 128M,
ZFS would at regular intervals eat all available resources flushing to
disk, leaving istgt waiting, and the iSCSI initiators timed out and had
to reconnect. The iSCSI initiators are the built-in software initiator
in VMWare ESX 4i.

//Svein

I could understand that happening. I've seen situations in the past
where my kmem was smaller than I wanted it to be, and within a few days
the overall ZFS disk IO would become incredibly slow because it was
trying to flush out the ARC way too often under intense external memory
pressure on the ARC.
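
If you want to watch for that, the ARC counters are exported as sysctls.
A rough sketch (sysctl names as they exist in the 7.x/8.x ZFS port;
check them on your build before relying on them):

  # compare the ARC's current size and target against the configured limits
  sysctl vm.kmem_size vfs.zfs.arc_min vfs.zfs.arc_max
  sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c

  # crude poll; if size keeps sagging far below arc_max under load,
  # something is forcing the ARC into constant reclamation
  while :; do sysctl -n kstat.zfs.misc.arcstats.size; sleep 5; done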
Assuming you have a large amount of RAM, I wonder if setting kmem_size,
arc_min and arc_max sufficiently large and using modern code would help,
as long as you made sure other processes on the machine don't squeeze
down Wired memory in top too much. In such a situation I would expect it
to operate fine while the ARC has enough kmem to expand as much as it
wants to; it might either hit a wall later, or perhaps given enough ARC
the reclamation might be tolerable. Or, if 128M ARC is good enough for
you, leave it :)
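
For anyone following along: all three knobs are boot-time tunables, so
they belong in /boot/loader.conf. A minimal sketch using the values from
my box above (sized for 20 gigs of RAM on amd64, as an example rather
than a recommendation; scale them to your own hardware):

  # /boot/loader.conf -- example ZFS memory tuning for a 20 GB amd64 box
  vm.kmem_size="18G"        # give the kernel map room for the ARC plus slack
  vfs.zfs.arc_min="4096M"   # keep the ARC from being squeezed too small
  vfs.zfs.arc_max="16384M"  # cap the ARC so everything else still fits

Reboot after changing them; vm.kmem_size in particular is only read at
boot.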