From: Jeremy Chadwick
Date: Sat, 21 Nov 2009 17:11:49 -0800
To: freebsd-stable@freebsd.org
Subject: Re: 7.2 dies in zfs
Message-ID: <20091122011149.GA19922@icarus.home.lan>
In-Reply-To: <20091122002926.GA19628@icarus.home.lan>

On Sat, Nov 21, 2009 at 04:29:26PM -0800, Jeremy Chadwick
wrote:
> On Sat, Nov 21, 2009 at 01:59:11PM -0600, Scot Hetzel wrote:
> > > RELENG_7 and RELENG_8 both, more or less, behave the same way with
> > > regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
> > > question as far as what's needed to stabilise ZFS on either 7.x or 8.x.
> > >
> > Under RELENG_8/i386, you still need to tune ZFS as mentioned in the
> > ZFS Tuning Guide:
> >
> > http://wiki.freebsd.org/ZFSTuningGuide
> >
> > With RELENG_8/amd64 no tuning is necessary, if the system has at least 2G RAM.
>
> Nope.
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2009-October/052256.html

I'll expand briefly on this because my post mentioned RELENG_7, and the
"state" of ZFS in RELENG_7 vs. RELENG_8 vs. HEAD is hard to follow: some
of the commits to (what once was) HEAD are actually in RELENG_8, given
when HEAD was tagged as RELENG_8.

There's a particular situation (with a patch for RELENG_8) that has been
"making the rounds":

http://lists.freebsd.org/pipermail/freebsd-fs/2009-October/006907.html
http://lists.freebsd.org/pipermail/freebsd-fs/2009-October/006969.html

The discussion is about slow performance as a result of the ARC
degrading, except numerous posters (including the OP) mention that their
box can also "just hang".

But this patch seems different from the one which got committed to HEAD
(what is CURRENT today), revision 1.25:

Commit message:
  Prevent paging pressure from draining arc too much
  - always drain arc if above arc_c_max
  - never drain arc if arc is below arc_c_max

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c.diff?r1=1.24;r2=1.25;f=h

This commit is in neither RELENG_8 nor RELENG_7 (I've confirmed by
looking at sources), and of course the patch is "?!?" given the nature
of the thread. I've looked at SVN commits to HEAD and Kip has been very,
very busy (even today).
:-)

...but then there's this commit, which happened ~5 months ago and made
it into HEAD at the time (and thus is in RELENG_8; also verified by
looking at source):

Commit message:
  Manually export rev 192360 from kmacy

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c.diff?r1=1.19;r2=1.20;f=h

...which I don't understand technically, but which appears to have a
direct effect on ARC limiting.

So, this is getting very hard to track/follow.

Circling back to kmem exhaustion: has there been any official statement
on what's actually causing it? Is it ARC overuse (and if so, how's that
even possible)? Is it ZIL? Is it a combination of things? Is it bugs in
the ZFS port (e.g. Solaris VM vs. FreeBSD VM)? Is it all of these
things? And ultimately -- how do we work around it?

With regards to loader.conf tuning, because this comes up often too:

There still has been no official or even semi-official (e.g. Wiki)
explanation of what should be tuned, and HOW things should be tuned.
What are the proper variables to tune? Tuning on RELENG_7 vs. RELENG_8
also probably differs at this point in time -- or does it?

The following loader.conf variables are under scrutiny:

  vm.kmem_size
  vm.kmem_size_max
  vfs.zfs.arc_min
  vfs.zfs.arc_max
  vfs.zfs.prefetch_disable
  vfs.zfs.zil_disable

The number of conflicting details on the mailing lists (freebsd-stable,
freebsd-current, and freebsd-fs) makes it very hard to discern at this
point how one is supposed to tune loader.conf to gain stability.

For example, I've seen pjd@ mention that one should NOT be touching
vm.kmem_size_max, but rather vm.kmem_size -- which I don't understand
(and I mean that as in "help me understand", not "I'm questioning the
logic"), especially since src/UPDATING states "you probably don't need
to adjust either of these".
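
For reference, here is a sketch of how the variables listed above would
be written in /boot/loader.conf. Only the syntax (name="value", with "#"
starting a comment) is meant to be taken from it. Every value is a
placeholder chosen purely to show the format, not a tested or
recommended setting. Which of these lines should be present at all, and
with what numbers, is exactly the open question in this thread.

```
# /boot/loader.conf
# SYNTAX ILLUSTRATION ONLY: all values below are placeholders,
# not recommendations for any particular release or architecture.

# Kernel memory map size and its upper bound
vm.kmem_size="1024M"
vm.kmem_size_max="1024M"

# Lower and upper bounds for the ZFS ARC
vfs.zfs.arc_min="64M"
vfs.zfs.arc_max="512M"

# 1 disables ZFS prefetch, 0 leaves it enabled
vfs.zfs.prefetch_disable="1"

# 1 disables the ZIL, 0 leaves it enabled
vfs.zfs.zil_disable="0"
```

Loader tunables are read at boot, so changes take effect only after a
reboot. Afterwards, the values actually in effect can be checked with
sysctl(8), e.g. "sysctl vm.kmem_size vfs.zfs.arc_max".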

This is why we need people who are familiar with both the ZFS code and
the VM to help provide details so that documentation can be updated (I'm
referring to the Wiki). If we could get something official from people
who are "in the know", that would be awesome.

Or maybe this is the wrong list to be discussing it at all, and
freebsd-fs is? I don't know any more...

It's almost like we need some kind of "ZFS on FreeBSD" newsletter that's
sent out weekly documenting all of what's getting changed and what it
solves and how it impacts users. Things are totally chaotic right now.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |