From: "Ronald Klop"
To: "Willem Jan Withagen", "Jeremy Chadwick"
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS freeze/livelock
Date: Sun, 10 Oct 2010 22:39:19 +0200
In-Reply-To: <20101010193415.GA93540@icarus.home.lan>
References: <4CB1DD0F.6000209@digiware.nl> <20101010193415.GA93540@icarus.home.lan>

On Sun, 10 Oct 2010 21:34:15 +0200, Jeremy Chadwick wrote:

> On Sun, Oct 10, 2010 at 05:34:39PM +0200, Willem Jan Withagen wrote:
>> Just had my FreeBSD freeze on me with what I would think is sort of
>> a livelock....
>>
>> While I was receiving zfs snapshots on my data pool.
>>
>> Top and systat just kept running,
>> but anything getting near a shell (and perhaps disk-io) ended up in:
>>
>> root@zfs.digiware.nl# gpart create -s gpt da6
>> load: 0.00 cmd: csh 12393 [zfsvfs->z_teardown_inactive_lock] 26.12r 0.00u 0.00s 0% 2480k
>> load: 0.10 cmd: csh 12393 [zfsvfs->z_teardown_inactive_lock] 96.01r 0.00u 0.00s 0% 2480k
>>
>> Trying to execute shutdown -r now had no effect whatsoever.
>> Neither did the three-finger salute.
>> (Well at least not in 60 sec I was willing to wait.)
>>
>> Only way out of this situation was hard-reset. And I do have to
>> admit I like ZFS for the speed it recovers after unexpected reboot.
>>
>> Too bad there was no alt-ctrl-backspace escape to debugger compiled
>> in. I'll do that with the next kernel, just in case.
>>
>> So the only data point I can give is the ^T output above.
>
> We don't know what FreeBSD version you're using (specifically uname -a
> output, since build date matters), but if it's RELENG_8 with ZFS v15,
> you might check out this thread (be sure to read Kai's and my diagnoses):
>
> http://lists.freebsd.org/pipermail/freebsd-fs/2010-October/009687.html
>
> I'm in the process of moving all of my machines, including my home
> server, over to gmirror. (Home machine started showing signs of serious
> ZFS performance degradation; mutt doing a stat() on 24 files and
> directories total taking literally 0.4 seconds on a dual-core machine.
> Makes no sense, doesn't happen with UFS2, I'm done.)
>

Sorry to hear it didn't work out for you this time. But if you are
running very important things on very fresh code, you should set up a
testing stage, keep older versions available to fall back to, be able
to restore from backup, or ...
At my company we roll out new minor version updates of MySQL now and
then, but we always make sure we have an old version available. Our
customers are more important than running the latest versions.

Home machines are different, since there usually aren't plenty of
backup machines around, but is it possible to give a developer a
temporary account to debug this? That would help the project going
forward.

Ronald.