Date:      Tue, 17 Dec 2013 14:50:16 +0100 (CET)
From:      krichy@tvnetwork.hu
To:        pjd@freebsd.org
Cc:        freebsd-fs@freebsd.org
Subject:   Re: kern/184677 (fwd)
Message-ID:  <alpine.DEB.2.10.1312171326520.7714@krichy.tvnetwork.hu>
In-Reply-To: <alpine.DEB.2.10.1312161647410.11599@krichy.tvnetwork.hu>
References:  <alpine.DEB.2.10.1312161647410.11599@krichy.tvnetwork.hu>

Dear devs,

I will sum up my experience regarding the issue:

The symptom is that a concurrent 'zfs send -R' and some activity on the
snapshot dir (or in the snapshot) may cause a deadlock.

After investigating the problem, I found that zfs send unmounts the
snapshots, and that causes the deadlock, so later I tested only with a
concurrent umount and the "activity". Later still I found that listing the
snapshots in .zfs/snapshot/ while unmounting them can trigger the same
deadlock, so I used those two operations for the tests. But to my
surprise, instead of a deadlock, a recursive lock panic arose.

The vnode for the ".zfs/snapshot/" directory contains ZFS's
zfsctl_snapdir_t structure (sdp). This contains a tree of mounted
snapshots, and each entry (sep) contains the vnode of the directory entry
on top of which the snapshot is mounted (se_root). The strange thing is
that the se_root member does not hold a reference on the vnode; it is
just a plain pointer to it.
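
To make the layout described above concrete, here is a small user-space
model in C. It is only an illustration, not the kernel code: every
model_* name is hypothetical, a linked list stands in for the tree, and
the vnode is reduced to a use count. What it shows is that registering an
entry stores the pointer without taking a reference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a vnode: only the use count matters here. */
struct model_vnode {
    int usecount;
};

/* Models one mounted-snapshot entry (sep). */
struct model_snapentry {
    const char *name;
    struct model_vnode *root;      /* models se_root: borrowed, no reference */
    struct model_snapentry *next;  /* list link; the real code uses a tree */
};

/* Models the per-directory state (sdp). */
struct model_snapdir {
    struct model_snapentry *snaps;
};

/* Register a mounted snapshot. The use count is deliberately NOT bumped. */
static void
model_snapdir_add(struct model_snapdir *sdp, struct model_snapentry *sep,
    const char *name, struct model_vnode *root)
{
    assert(sdp != NULL && sep != NULL && name != NULL && root != NULL);
    sep->name = name;
    sep->root = root;
    sep->next = sdp->snaps;
    sdp->snaps = sep;
}

/* Find an entry by snapshot name, or return NULL if it is not mounted. */
static struct model_snapentry *
model_snapdir_find(const struct model_snapdir *sdp, const char *name)
{
    for (struct model_snapentry *sep = sdp->snaps; sep != NULL;
        sep = sep->next) {
        if (strcmp(sep->name, name) == 0)
            return sep;
    }
    return NULL;
}
```

In this model, once the only holder drops its reference the use count
reaches zero while model_snapdir_find() still returns an entry pointing at
that vnode, which is the situation the lookup and inactive paths have to
coordinate around.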

Upon entry lookup (zfsctl_snapdir_lookup()) the "snapshot" vnode is
locked, the zfsctl_snapdir_t's tree is locked, and the tree is searched
for an existing mount. If no entry is found, the lookup does the mount.
If an entry is found, its se_root member contains the vnode which the
snapshot is mounted on. Thus, a reference is taken on it, and the
traverse() call resolves it to the real root vnode of the mounted
snapshot, returning that vnode locked. (Examining the traverse() code, I
found that it does not follow FreeBSD's lock order recommendation
described in sys/kern/vfs_subr.c.)

On the other side, when an umount is issued, the se_root vnode loses its
last reference (as only the mountpoint holds one on it) and goes through
the vinactive() path to zfsctl_snapshot_inactive(). In FreeBSD this is
called with the vnode locked, so there is a race that can end in a
deadlock: zfsctl_snapdir_lookup() holds the mutex for the sdp tree while
traverse() tries to acquire se_root, and zfsctl_snapshot_inactive() holds
the lock on se_root while it tries to acquire the sdp lock.

zfsctl_snapshot_inactive() has an if statement checking v_usecount, which
is incremented in zfsctl_snapdir_lookup(), but in that context the
increment is not covered by VI_LOCK. And it seems to me that FreeBSD's
vinactive() path assumes that the vnode remains inactive (as opposed to
illumos, at least as I read the code). So zfsctl_snapshot_inactive()
must free resources while in a locked state. I was a bit confused, and
that is probably why the previously posted patch is as it is.
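
The concern about the v_usecount check is a general check-then-act
problem, which the following C model replays deterministically. Again
this is only a sketch: the model_* names are hypothetical, the two halves
of the inactive path are split artificially so that the lookup step can
run between them, and the "teardown" is just a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vnode state involved in the check. */
struct model_vnode {
    int usecount;    /* models v_usecount */
    bool torn_down;  /* set when the inactive path frees its state */
};

/* Inactive path, step 1: decide, without any interlock, if vp is unused. */
static bool
model_inactive_check(const struct model_vnode *vp)
{
    assert(vp != NULL);
    return vp->usecount == 0;
}

/* Inactive path, step 2: act on the decision taken in step 1. */
static void
model_inactive_act(struct model_vnode *vp, bool decided_unused)
{
    assert(vp != NULL);
    if (decided_unused)
        vp->torn_down = true;
}

/* Lookup path: takes a new reference on the vnode. */
static void
model_lookup_ref(struct model_vnode *vp)
{
    assert(vp != NULL);
    vp->usecount++;
}

/* Returns true when teardown ran on a vnode that is in use again. */
static bool
model_replay_stale_check(void)
{
    struct model_vnode vp = { 0, false };

    bool unused = model_inactive_check(&vp); /* sees usecount == 0 */
    model_lookup_ref(&vp);                   /* lookup slips in between */
    model_inactive_act(&vp, unused);         /* acts on the stale answer */

    return vp.torn_down && vp.usecount > 0;
}
```

In the model the stale decision wins: the state is torn down although a
new reference exists. Whether the real code can hit this window depends
on which locks actually serialize the two paths, which is what I am
unsure about.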

If I had had some guidance on how to approach this problem, I could
probably have worked out a nicer, shorter solution.

Could someone please comment on my post?

Regards,



Kojedzinszky Richard
Euronet Magyarorszag Informatikai Zrt.

On Mon, 16 Dec 2013, krichy@tvnetwork.hu wrote:

> Date: Mon, 16 Dec 2013 16:52:16 +0100 (CET)
> From: krichy@tvnetwork.hu
> To: pjd@freebsd.org
> Cc: freebsd-fs@freebsd.org
> Subject: Re: kern/184677 (fwd)
> 
> Dear PJD,
>
> I am a happy FreeBSD user, I am sure you've read my previous posts regarding 
> some issues in ZFS. Please give some advice for me, where to look for 
> solutions, or how could I help to resolve those issues.
>
> Regards,
> Kojedzinszky Richard
> Euronet Magyarorszag Informatikai Zrt.
>
> ---------- Forwarded message ----------
> Date: Mon, 16 Dec 2013 15:23:06 +0100 (CET)
> From: krichy@tvnetwork.hu
> To: freebsd-fs@freebsd.org
> Subject: Re: kern/184677
>
>
> It seems that pjd made a change which eliminated the zfsdev_state_lock on
> Fri Aug 12 07:04:16 2011 +0000, which might have introduced a new deadlock
> situation. Any comments on this?
>
>
> Kojedzinszky Richard
> Euronet Magyarorszag Informatikai Zrt.
>
> On Mon, 16 Dec 2013, krichy@tvnetwork.hu wrote:
>
>> Date: Mon, 16 Dec 2013 11:08:11 +0100 (CET)
>> From: krichy@tvnetwork.hu
>> To: freebsd-fs@freebsd.org
>> Subject: kern/184677
>> 
>> Dear devs,
>> 
>> I've attached a patch which makes the recursive lockmgr panic disappear,
>> and makes the reported bug disappear. I don't know whether I followed the
>> guidelines properly, but at least it works for me. Could some
>> ZFS/FreeBSD fs expert please review it and fix it where needed?
>> 
>> But unfortunately, my original problem is still not solved; it may be the
>> same as Ryan's:
>> http://lists.freebsd.org/pipermail/freebsd-fs/2013-December/018707.html
>> 
>> Tracing the problem down: zfsctl_snapdir_lookup() tries to acquire
>> spa_namespace_lock, while a finishing zfs send -R does a
>> zfsdev_close(), which also holds the same mutex. This causes a
>> deadlock scenario. I looked at illumos's code, and for some reason they
>> use another mutex in zfsdev_close(), which therefore may not deadlock with
>> zfsctl_snapdir_lookup(). But I am still investigating the problem.
>> 
>> I would like to help make ZFS more stable on FreeBSD as well, with its
>> whole functionality. I would be very thankful if some expert could give
>> some advice on how to solve these bugs. PJD, Steven, Xin?
>> 
>> Thanks in advance,
>> 
>> 
>> Kojedzinszky Richard
>> Euronet Magyarorszag Informatikai Zrt.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>


