Date:      Mon, 6 Feb 2012 10:23:05 +1100
From:      Morgan Reed <morgan.s.reed@gmail.com>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: ZFS panics on pool moved from OpenSolaris
Message-ID:  <CAKnh_YuipDfRp3NPBXqQY%2BXpuo8B=dmPGb-x2m0LiHMOT7k-2A@mail.gmail.com>
In-Reply-To: <4F2E65EC.60107@FreeBSD.org>
References:  <CAKnh_YuL8NfAtaMSLZASHxBsP=UHWvV=haMNDFYT1A2SGFR4BQ@mail.gmail.com> <4F2E65EC.60107@FreeBSD.org>

Hi Andriy,

Thanks for that. The patch has significantly improved matters: I'm now
able to run a find across part of the drive without issue. However, I'm
still seeing the panics on some directories; stack trace below:

panic: avl_find() succeeded inside avl_add()
cpuid = 0
KDB: stack backtrace:
#0 0xc0a4b157 at kdb_backtrace+0x47
#1 0xc0a186b7 at panic+0x117
#2 0xc5a2d7b2 at avl_add+0x52
#3 0xc5ac44e6 at zfs_fuid_table_load+0x1f6
#4 0xc5ac479e at zfs_fuid_init+0x14e
#5 0xc5ac4893 at zfs_fuid_find_by_idx+0xc3
#6 0xc5ac48ed at zfs_fuid_map_id+0x2d
#7 0xc5ac492f at zfs_groupmember+0x2f
#8 0xc5adbdcb at zfs_zaccess_aces_check+0x1db
#9 0xc5adc257 at zfs_zaccess+0xb7
#10 0xc5afa7d4 at zfs_freebsd_getattr+0x1f4
#11 0xc0d69322 at VOP_GETATTR_APV+0x42
#12 0xc0ab81c9 at vn_stat+0x79
#13 0xc0aaefdd at kern_statat_vnhook+0xfd
#14 0xc0aaf1cc at kern_statat+0x3c
#15 0xc0aaf156 at kern_lstat+0x36
#16 0xc0aaf1ff at sys_lstat+0x2f
#17 0xc0d49315 at syscall+0x355

Same trace as previously on 9.0.
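
For reference, the panic string suggests that avl_add() looks the new node
up first and refuses to insert it when an equal key is already in the tree.
Below is a minimal userland sketch of that find-then-insert contract. It is
not the kernel code: key_set, set_find and set_add are hypothetical names,
and a small sorted array stands in for the AVL tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SET_MAX 16

/* Hypothetical stand-in for an AVL tree: a small sorted array of keys. */
struct key_set {
	uint64_t keys[SET_MAX];
	size_t   count;
};

/*
 * Look up key. Returns 1 if it is present and 0 if it is absent. In both
 * cases *where receives the slot the key occupies or would be inserted at.
 */
int
set_find(const struct key_set *s, uint64_t key, size_t *where)
{
	size_t i;

	assert(s != NULL && where != NULL);
	for (i = 0; i < s->count && s->keys[i] < key; i++)
		;
	*where = i;
	return (i < s->count && s->keys[i] == key);
}

/*
 * Add a key that must not already be present.
 * Returns 0 on success, -1 on a duplicate (the case in which the kernel
 * routine panics instead), and -2 when the array is full.
 */
int
set_add(struct key_set *s, uint64_t key)
{
	size_t where, i;

	if (set_find(s, key, &where))
		return (-1);	/* "find succeeded inside add" */
	if (s->count == SET_MAX)
		return (-2);
	/* Shift the tail up by one slot to keep the array sorted. */
	for (i = s->count; i > where; i--)
		s->keys[i] = s->keys[i - 1];
	s->keys[where] = key;
	s->count++;
	return (0);
}
```

Read this way, the trace would mean that zfs_fuid_table_load tried to add an
entry whose key was already loaded, i.e. the on-disk table appears to contain
a duplicate.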

I'm following your advice to Karli and throwing some printfs into
zfs_fuid_table_load; I'll advise if I find anything enlightening.
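
A rough userland sketch of the sort of check such printfs could perform:
print each entry as it is loaded and flag any whose index or domain string
repeats an earlier one. This is an assumption-laden model, not the ZFS code:
struct fuid_entry and report_duplicates are hypothetical names, and it
assumes the table is keyed both by a numeric index and by a domain string.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userland copy of one FUID table entry. */
struct fuid_entry {
	uint64_t    idx;	/* numeric index (assumed key) */
	const char *domain;	/* domain string (assumed key) */
};

/*
 * Print every entry and flag those whose index or domain repeats an
 * earlier entry. Returns the number of flagged entries.
 */
size_t
report_duplicates(const struct fuid_entry *e, size_t n, FILE *out)
{
	size_t i, j, dups = 0;

	assert(e != NULL || n == 0);
	assert(out != NULL);
	for (i = 0; i < n; i++) {
		int dup = 0;

		fprintf(out, "entry %zu: idx=%ju domain=%s\n",
		    i, (uintmax_t)e[i].idx, e[i].domain);
		/* Compare against every entry seen so far. */
		for (j = 0; j < i; j++) {
			if (e[j].idx == e[i].idx) {
				fprintf(out,
				    "  duplicate idx (first seen at entry %zu)\n", j);
				dup = 1;
			}
			if (strcmp(e[j].domain, e[i].domain) == 0) {
				fprintf(out,
				    "  duplicate domain (first seen at entry %zu)\n", j);
				dup = 1;
			}
		}
		dups += dup;
	}
	return (dups);
}
```

Called on a made-up table such as { {1, "S-1-5-21-1"}, {2, "S-1-5-21-2"},
{3, "S-1-5-21-1"} }, it would flag the third entry because its domain
repeats the first.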

Thanks,

Morgan

On Sun, Feb 5, 2012 at 22:20, Andriy Gapon <avg@freebsd.org> wrote:
>
> Please see this thread:
> http://lists.freebsd.org/pipermail/freebsd-fs/2011-December/013215.html
> It looks like the same issue.
> The patch has been committed in head, not sure if it's MFCed.
>
> on 05/02/2012 06:57 Morgan Reed said the following:
>> Hi all,
>>
>>      I'm experiencing an issue in migrating my NAS from OpenSolaris
>> over to FreeBSD, I've tried both releng_8_2 and releng_9 I have
>> similar issues in both cases.
>>
>> The pool is a RAID-Z pool comprising 4 1TB drives, it was originally
>> created on OpenSolaris (not sure what version, 2010.09 maybe, it was
>> one of the last ones prior to the Oracle acquisition), pool was a V14
>> pool, initially I built a FreeBSD-8.2 system to migrate the pool to,
>> migrated it over OK, upgraded it from V14 to V15, but later testing
>> revealed something wasn't happy: listing certain directories (and
>> even doing an ls -la at the root of the pool) resulted in a kernel
>> panic (Mostly GENERIC kernel, rebuilt with KVA_PAGES 512 but other
>> than that stock);
>>
>> panic: avl_find() succeeded inside avl_add()
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0x808e0d07 at kdb_backtrace+0x47
>> #1 0x808b1dc7 at panic+0x117
>> #2 0x862e6602 at avl_add+0x52
>> #3 0x8635c136 at zfs_fuid_table_load+0x1f6
>> #4 0x8635c3ee at zfs_fuid_init+0x14e
>> #5 0x8635c4d7 at zfs_fuid_find_by_idx+0xb7
>> #6 0x8635c52d at zfs_fuid_map_id+0x2d
>> #7 0x8635d56f at zfs_groupmember+0x2f
>> #8 0x8636df0b at zfs_zaccess_aces_check+0x1db
>> #9 0x8636377 at zfs_zaccess+0x57
>> #10 0x8636d6fb at zfs_zaccess_rwx+0x3b
>> #11 0x86385f61 at zfs_freebsd_access+0xf1
>> #12 0x80c02ea2 at VOP_ACCESS_APV+0x42
>> #13 0x809457cf at change_dir+0x5f
>> #14 0x809467b1 at kern_chdir+0x81
>> #15 0x80946a22 at chdir+0x22
>> #16 0x808eca39 at syscallenter+0x329
>> #17 0x80be4e14 at syscall+0x34
>>
>> Looks like something in the permissions structure was causing grief,
>> tried running a scrub across the pool, didn't resolve the issue.
>>
>> After spending some time fighting with it I decided that it wasn't
>> worth the effort, and I upgraded to FreeBSD-9.0 to see if that would
>> assist (I normally avoid x.0 releases), once again pool imported fine,
>> however I was still seeing similar panics, ran a scrub across the
>> pool, still not happy, also upgraded the pool to v28 tried again, when
>> that failed I scrubbed again but still no joy.
>>
>> As a matter of interest I booted an OpenIndiana live CD and tried
>> copying the directories' contents to another location, I am now able to
>> list the directories. However there are still issues.
>>
>> The issue seems to have shifted slightly, stack trace from a recent
>> panic is below (GENERIC kernel on 9.0-RELEASE);
>>
>> panic: avl_find() succeeded inside avl_add()
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0xc0a4b157 at kdb_backtrace+0x47
>> #1 0xc0a186b7 at panic+0x117
>> #2 0xc5a2d7b2 at avl_add+0x52
>> #3 0xc5ac44e6 at zfs_fuid_table_load+0x1f6
>> #4 0xc5ac479e at zfs_fuid_init+0x14e
>> #5 0xc5ac4893 at zfs_fuid_find_by_idx+0xc3
>> #6 0xc5ac48ed at zfs_fuid_map_id+0x2d
>> #7 0xc5ac492f at zfs_groupmember+0x2f
>> #8 0xc5adbdcb at zfs_zaccess_aces_check+0x1db
>> #9 0xc5adc257 at zfs_zaccess+0xb7
>> #10 0xc5afa7d4 at zfs_freebsd_getattr+0x1f4
>> #11 0xc0d69322 at VOP_GETATTR_APV+0x42
>> #12 0xc0ab81c9 at vn_stat+0x79
>> #13 0xc0aaefdd at kern_statat_vnhook+0xfd
>> #14 0xc0aaf1cc at kern_statat+0x3c
>> #15 0xc0aaf156 at kern_lstat+0x36
>> #16 0xc0aaf1ff at sys_lstat+0x2f
>> #17 0xc0d49315 at syscall+0x355
>>
>> This time it appears to be related to some extended attribute(s), I
>> can do an ls on one of the directories in question but an ls -la
>> causes a panic, so it would seem that it's some attribute which is
>> only shown in the long form of the ls output that is causing the
>> issue.
>>
>> I've done some digging around via the magic of google and this seems
>> to be a fairly common issue, but I've not found a solution for it
>> (barring copying the data off, recreating the pool and restoring the
>> data, which I'd like to avoid if at all possible).
>>
>> If I could determine what the problematic attribute was and a means to
>> strip it (be that from FreeBSD or from an OpenIndiana liveCD) I think
>> that will get me back up and running.
>>
>> If anybody can provide some suggestions as to what I may be able to do
>> to resolve this issue in situ I would be very grateful.
>
>
>
> --
> Andriy Gapon



-- 
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-- Benjamin Franklin, 1759


