Date:      Thu, 11 Oct 2012 12:32:51 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>, Sean Chittenden <sean@chittenden.org>
Subject:   Re: ZFS crashing during snapdir lookup for non-existent snapshot...
Message-ID:  <50769243.2010208@FreeBSD.org>
In-Reply-To: <5075FA8E.10200@FreeBSD.org>
References:  <B244C0E9-539D-4F7C-8616-378E8469F4BB@chittenden.org> <5075E3E0.7060706@FreeBSD.org> <0A6567E7-3BA5-4F27-AEB2-1C00EDE00641@chittenden.org> <5075EDDD.4030008@FreeBSD.org> <A1901AB5-6E83-488E-9D29-EA7C4E3720F3@chittenden.org> <5075FA8E.10200@FreeBSD.org>

on 11/10/2012 01:45 Andriy Gapon said the following:
> 
> [restoring mailing list cc]
> 
> on 11/10/2012 00:58 Sean Chittenden said the following:
>>>> I don't have a dump from this particular system, only the backtrace from the crash. The system is ZFS only and I only have a ZFS swapdir. :-/
>>>>
>>>> I have the kernel still so I can poke at the code and the compiled kernel (kernel.symbols). ? What are you looking for? -sc
>>>>
>>>
>>> list *zfsctl_snapdir_lookup+0x124 in kgdb
>>
>> (kgdb) list *zfsctl_snapdir_lookup+0x124
>> 0xffffffff816e9384 is in zfsctl_snapdir_lookup (/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c:992).
>> 987				*direntflags = ED_CASE_CONFLICT;
>> 988	#endif
>> 989		}
>> 990	
>> 991		mutex_enter(&sdp->sd_lock);
>> 992		search.se_name = (char *)nm;
>> 993		if ((sep = avl_find(&sdp->sd_snaps, &search, &where)) != NULL) {
>> 994			*vpp = sep->se_root;
>> 995			VN_HOLD(*vpp);
>> 996			err = traverse(vpp, LK_EXCLUSIVE | LK_RETRY);
> 
> It seems that the problem is in a Solaris-ism that remained in the code.
> I think that zfsctl_snapdir_inactive should not destroy sdp; that should be
> the job of vop_reclaim.  Otherwise, if the vnode is re-activated, its v_data
> points to freed memory.
> 


In particular, I have this scenario in mind:
- one thread, T1, performs a vput-ish operation which leads to vop_inactive on a
current vnode that represents ".zfs/snapshot"
- at the same time T2 executes a lookup that goes into zfsctl_root_lookup
- let's assume that at some point T1 is at the very start of
zfsctl_snapdir_inactive, it holds just a vnode lock
- at the same time T2 is in gfs_dir_lookup->gfs_dir_lookup_static and it has
gfs_dir_lock
- so T2 finds the 'snapshot' static entry in gfsd_static[]
- T2 finds the cached vnode and adds a reference
- T2 does gfs_dir_unlock and returns the vnode
- now T1 proceeds through zfsctl_snapdir_inactive and destroys the v_data (but
without clearing the pointer, even)
- T2 uses the vnode and gets a crash

Possible resolutions:
- make vop_inactive a noop and make vop_reclaim call the current inactive methods
- check v_usecount in gfs_file_inactive after gfs_dir_lock is obtained and bail
out if it is > 0 (somewhat similar to what zfs_zinactive does)
- something else?

An easy way to reproduce the problem in one way or another is to run many of
the following in parallel:
while true; do ls -l /pool/fs/.zfs/ >/dev/null; done

Here is another panic that is a variation of the above scenario.  A duplicate
gfs_vop_inactive call happens after a "harmless" vop_pathconf call (which
doesn't touch the vnode).  In this case the "shares" entry appears to be a
random victim:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff825fe7dd
stack pointer           = 0x28:0xffffff80e040b800
frame pointer           = 0x28:0xffffff80e040b830
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, IOPL = 0
current process         = 712 (ls)
trap number             = 12
panic: page fault
cpuid = 1
curthread: 0xfffffe0003d8a9a0
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802d2bba = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff805596fa = kdb_backtrace+0x3a
panic() at 0xffffffff8051c2a6 = panic+0x266
trap_fatal() at 0xffffffff8070741d = trap_fatal+0x3ad
trap_pfault() at 0xffffffff8070756c = trap_pfault+0x12c
trap() at 0xffffffff80707d19 = trap+0x4f9
calltrap() at 0xffffffff806ef903 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff825fe7dd, rsp = 0xffffff80e040b800, rbp =
0xffffff80e040b830 ---
gfs_vop_inactive() at 0xffffffff825fe7dd = gfs_vop_inactive+0x1d
VOP_INACTIVE_APV() at 0xffffffff80782fb4 = VOP_INACTIVE_APV+0x114
vinactive() at 0xffffffff805c84ad = vinactive+0x15d
vputx() at 0xffffffff805ca962 = vputx+0x4d2
vput() at 0xffffffff805ca9ce = vput+0xe
kern_pathconf() at 0xffffffff805cd44e = kern_pathconf+0x10e
sys_lpathconf() at 0xffffffff805cd4aa = sys_lpathconf+0x1a
amd64_syscall() at 0xffffffff80706953 = amd64_syscall+0x313
Xfast_syscall() at 0xffffffff806efbe7 = Xfast_syscall+0xf7
-- 
Andriy Gapon


