From: John Nielsen
To: freebsd-current@freebsd.org
Cc: Tim Kientzle
Date: Sun, 12 Apr 2009 12:48:36 -0400
Subject: Re: ZFS/extattr lockup (was Re: bsdtar lockup on Current-03/10/2009)

On Sunday 12 April 2009 12:14:01 am Tim Kientzle wrote:
> John Nielsen wrote:
> > On Wednesday 11 March 2009 12:35:37 am Tim Kientzle wrote:
> >> John Nielsen wrote:
> >>> I today noticed the same problem on -CURRENT i386 built March 9.
> >>> ...
> >>> using ZFS and initially
> >>> thought that was the source of the regression but I haven't
> >>> produced the lockup with anything but tar and the extattr removal
> >>> hack seems to have fixed it for now ...
> >>
> >> The common element so far seems to be ZFS. Can you verify that
> >>
> >> $ lsextattr -h user
> >>
> >> hangs on your system as well? That invokes the same
> >> extattr_list_link system call used by tar to enumerate
> >> the extended attributes on a file.
> >
> > Confirmed. I ran the command on a file in /root (UFS) with no
> > problem. Running again on a file in /home/john (ZFS) caused the hang.
>
> John Baldwin committed a fix for the ZFS problem
> (r189967, 2009-03-18 16:19:44UTC), so this should be fixed.
>
> Could you verify that lsextattr no longer hangs
> the kernel after this point?
>
> Could you verify that re-enabling extended attribute
> archiving in tar no longer hangs?
>
> I'd like to enable the extended attribute support
> in tar for real if this is really fixed.

Yes, I saw the commit and manually reverted libarchive to test on 3/20.
I sent the post below but I don't know if it went through. Everything
was solid then and has been since, so I think jhb's patch did the trick.

JN

----------  Forwarded Message  ----------

Subject: Re: repeatable ZFS panic: share->excl
Date: Friday 20 March 2009
From: John Nielsen
To: freebsd-current@freebsd.org

On Wednesday 18 March 2009 12:09:18 pm John Baldwin wrote:
> On Tuesday 17 March 2009 3:04:40 am Pawel Jakub Dawidek wrote:
> > On Fri, Mar 13, 2009 at 02:08:03PM -0400, John Baldwin wrote:
> > > John Baldwin wrote:
> > > > Yes, I think that is the real bug. Looking at this further I
> > > > think zfs_get_xattrdir() will return the vnode locked if it has
> > > > to create a new node via zfs_make_xattrdir() but only returns it
> > > > held and unlocked if it finds an existing one.
> > > > So my new patch
> > > > is to just fix zfs_get_xattrdir() to unlock the vnode if it
> > > > creates a new one like so:
> > > >
> > > > (Sorry, TBird is probably going to butcher all the whitespace):
> > > >
> > > > --- //depot/user/jhb/lock/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c
> > > > +++ /Users/jhb/work/p4/lock/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c
> > > > @@ -940,6 +940,7 @@
> > > >  		/* NB: we already did dmu_tx_wait() if necessary */
> > > >  		goto top;
> > > >  	}
> > > > +	VOP_UNLOCK(*xvpp, 0);
> > > >
> > > >  	return (error);
> > > >  }
> > > >
> > > > A non-butchered version is at
> > > > www.FreeBSD.org/~jhb/patches/zfs_ea.patch.
> > >
> > > So lulf@ reports success with this patch. Pawel, can you review
> > > it?
> >
> > Yes, it works for me too and looks good. The only thing we need to
> > change is to check for error being 0 before unlocking the vnode.
> > The zfs_make_xattrdir() function can still return with EIO, so I'd
> > add something like this:
> >
> > 	if (error == 0)
> > 		VOP_UNLOCK(*xvpp, 0);
>
> Yes, I realized this about 30 minutes after I sent this e-mail. :-P I
> will commit a version with the error check today.
>
> > Thank you John for spending time on tracking this one down.
>
> Sure, was good to read a bit of the ZFS code.

I had a chance to test this patch today and it looks good. Which is to
say my system hasn't hung yet. :) I built world from 3/19 -HEAD and was
able to "lsextattr -h user" on files on ZFS without any ill effects. I
un-patched libarchive (so it uses extended attributes again), rebuilt
and reinstalled it and bsdtar, and was able to do portupgrades normally.

Thanks to all involved.

JN

-------------------------------------------------------