Date:      Wed, 28 Nov 2012 07:07:41 -0700
From:      Josh Beard <josh@signalboxes.net>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS: Panic when attempting to delete certain data
Message-ID:  <CAHDrHSsVfeRSbSxe6W7qtHSfOi_DA6QXG38utKgQko7A49uoXQ@mail.gmail.com>
In-Reply-To: <50B61672.2070406@FreeBSD.org>
References:  <CAHDrHStcfSJ-9ueSV%2BFujEsmAK3zMX2CAGVD6Xz_2gJAThu5Kg@mail.gmail.com> <50B50B04.8020109@FreeBSD.org> <CAHDrHSteXO2pzkUtG0WHcgChsyXiBk2of4P0gR7ccY5srWNdew@mail.gmail.com> <50B52CEC.9080208@FreeBSD.org> <CAHDrHSvsrwx5qEimdx7e6eHoXTLwQ1AeH80Nx8izA7BTfFb8Jg@mail.gmail.com> <50B5CFAF.8070306@FreeBSD.org> <CAHDrHSv8Tv6Yr9z0PzWFFOqZPLiTN9EkH6nQsEMr48sz%2Bsaizg@mail.gmail.com> <50B61672.2070406@FreeBSD.org>

On Wed, Nov 28, 2012 at 6:49 AM, Andriy Gapon <avg@freebsd.org> wrote:

> on 28/11/2012 15:41 Josh Beard said the following:
> >
> >
> > On Wed, Nov 28, 2012 at 1:47 AM, Andriy Gapon <avg@freebsd.org> wrote:
> >
> >     on 27/11/2012 23:47 Josh Beard said the following:
> >     > Thanks!  Here we go:
> >     >
> >     > (kgdb) frame 7
> >     > #7  0xffffffff80ebd45a in zfs_freebsd_remove (ap=Variable "ap" is not available.
> >     > ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855
> >     > 1855                    dmu_tx_hold_sa(tx, xzp->z_sa_hdl, B_FALSE);
> >     > (kgdb) list
> >     > 1850                &xattr_obj, sizeof (xattr_obj));
> >     > 1851            if (error == 0 && xattr_obj) {
> >     > 1852                    error = zfs_zget(zfsvfs, xattr_obj, &xzp);
> >     > 1853                    ASSERT3U(error, ==, 0);
> >     > 1854                    dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_TRUE);
> >     > 1855                    dmu_tx_hold_sa(tx, xzp->z_sa_hdl, B_FALSE);
> >     > 1856            }
> >     > 1857
> >     > 1858            mutex_enter(&zp->z_lock);
> >     > 1859            if ((acl_obj = zfs_external_acl(zp)) != 0 && may_delete_now)
> >
> >     That's what I suspected.
> >
> >     > (kgdb) frame 7
> >     > #7  0xffffffff80ebd45a in zfs_freebsd_remove (ap=Variable "ap" is not available.
> >     > ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855
> >     > 1855                    dmu_tx_hold_sa(tx, xzp->z_sa_hdl, B_FALSE);
> >     > (kgdb) info local
> >     > No locals.
> >     > (kgdb)
> >
> >     A little bit unfortunate.
> >
> >     > # ls -Pi /DSDK12_NHR.pax.gz  (symlink to ../Archive.pax.gz)
> >     > 249868 ./Imaging/Packages/DSDK12_NHR_2012-02-23.pkg/Contents/Resources/DSDK12_NHR.pax.gz
> >     >
> >     > # zdb -ddddd store/tdxs1 249868
> >     > Dataset store/tdxs1 [ZPL], ID 109, cr_txg 35014, 1.33T, 1106389 objects, rootbp
> >     > DVA[0]=<0:8000204400:400> DVA[1]=<0:30800644400:400> [L0 DMU objset] fletcher4
> >     > lzjb LE contiguous unique double size=800L/200P birth=1167710L/1167710P
> >     > fill=1106389 cksum=1966704b59:757ae6cb615:134bfd597bca9:254b2ee348393d
> >     >
> >     >     Object  lvl   iblk   dblk  dsize  lsize   %full  type
> >     >     249868    1    16K    512      0    512    0.00  ZFS plain file
> >     >                                         201   bonus  System attributes
> >     >         dnode flags: USERUSED_ACCOUNTED
> >     >         dnode maxblkid: 0
> >     >         path    /tech/2012-09-14-01-00/Imaging/Packages/DSDK12_NHR_2012-02-23.pkg/Contents/Resources/DSDK12_NHR.pax.gz
> >     >         uid     300002
> >     >         gid     80
> >     >         atime   Tue Nov 27 14:43:00 2012
> >     >         mtime   Thu Feb 23 08:59:21 2012
> >     >         ctime   Fri Sep 14 01:12:37 2012
> >     >         crtime  Fri Sep 14 01:11:50 2012
> >     >         gen     81430
> >     >         mode    120755
> >     >         size    17
> >     >         parent  249866
> >     >         links   1
> >     >         pflags  40800000104
> >     >         xattr   230
> >
> >
> > # zdb -ddddd store/tdxs1 230
> > Dataset store/tdxs1 [ZPL], ID 109, cr_txg 35014, 1.33T, 1106389 objects, rootbp
> > DVA[0]=<0:8000284800:400> DVA[1]=<0:308006b3800:400> [L0 DMU objset] fletcher4
> > lzjb LE contiguous unique double size=800L/200P birth=1167748L/1167748P
> > fill=1106389 cksum=16e1e08cb8:70a50f1ec5a:13419c71d1cda:260fbed28af8f5
> >
> >     Object  lvl   iblk   dblk  dsize  lsize   %full  type
> > zdb: dmu_bonus_hold(230) failed, errno 2
> >
>
> errno 2 is ENOENT, so all pieces match.
> The file appears to have extended attributes, but they do not actually exist.
> If your ZFS module was built with debug, then ASSERT3U(error, ==, 0) would be
> triggered.
>
> This is something that I said earlier in the private conversation that I
> mentioned:
>
> Andriy Gapon said:
> > The relevant commits are r240632 and r240345.  I can't recall when I MFC-ed them
> > to stable/9.  Most likely they are not in the 9.1 releng branch.
> [snip]
> > I also don't have good advice on how to fix the existing corruption.
> > I'd probably go with using something like tar/cpio/pax to move data to a fresh
> > filesystem.  Please be sure to use the latest stable/9 or 8 to not run into the
> > issue again.
> >
> > zfs send/recv won't help, it would mindlessly replicate the corrupted attributes.
>
> To this I might add that the bugs were not FreeBSD-specific and they may
> still be present in ZFS upstream and other ZFS ports.
>
> Thank you for the debugging information.
> --
> Andriy Gapon
>

Andriy,

Does this mean re-creating the dataset (or the pool?) under a recent
stable/9 build would resolve this?

I have no problem doing so - as the data is redundant.

Thanks for your help on this.

Josh


