From owner-freebsd-fs@FreeBSD.ORG  Fri Jul 15 22:02:28 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Message-ID: <4E20B8F3.5060603@FreeBSD.org>
Date: Sat, 16 Jul 2011 00:02:27 +0200
From: Martin Matuska <mm@FreeBSD.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
	rv:5.0) Gecko/20110624 Thunderbird/5.0
MIME-Version: 1.0
To: Luke Marsden <luke-lists@hybrid-logic.co.uk>
References: <1310733049.26698.69.camel@behemoth>
In-Reply-To: <1310733049.26698.69.camel@behemoth>
X-Enigmail-Version: 1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: freebsd-fs@freebsd.org, tech@hybrid-logic.co.uk
Subject: Re: Experiences with ZFS v28 - including deadlock

Hi Luke,

Regarding the incremental receive: does the mount still happen if you pass
the "-u" option to the zfs receive command?

The manpage for zfs (receive section) says:

-u
    File system that is associated with the received stream is not mounted.
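
For reference, here is a minimal sketch in Python of an incremental send
piped into a receive with "-u". The pool, dataset and snapshot names are
hypothetical placeholders, and the helper functions are made up for
illustration. Only the "zfs send -i" and "zfs receive -u" invocations are
taken from the zfs command itself.

```python
import shlex
import subprocess


def incremental_receive_cmds(src, from_snap, to_snap, dst):
    """Return (send_argv, recv_argv) for an incremental send/receive.

    All dataset and snapshot names are hypothetical placeholders.
    """
    send = ["zfs", "send", "-i", f"{src}@{from_snap}", f"{src}@{to_snap}"]
    # -u: do not mount the file system associated with the received stream.
    recv = ["zfs", "receive", "-u", dst]
    return send, recv


def run_pipeline(send, recv):
    """Run `send | recv` and return the exit status of the receive side."""
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(recv, stdin=sender.stdout)
    sender.stdout.close()  # let the sender get SIGPIPE if the receiver exits
    status = receiver.wait()
    sender.wait()
    return status


if __name__ == "__main__":
    send, recv = incremental_receive_cmds(
        "tank/site", "snap1", "snap2", "backup/site"
    )
    # Print the pipeline instead of running it; call run_pipeline(send, recv)
    # on a machine where these (hypothetical) datasets exist.
    print(" | ".join(shlex.join(cmd) for cmd in (send, recv)))
```

If the target still ends up mounted after such a receive, that would
suggest the behaviour is independent of the "-u" flag.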

Cheers,
mm

On 15 Jul 2011 at 14:30, Luke Marsden wrote:
> Hi all,
>
> Having just quite extensively tested the v28 patchset contained within
> http://mfsbsd.vx.sk/iso/mfsbsd-se-8.2-zfsv28-amd64.iso (updated
> 19.06.2011) I wanted to share my experiences in the hope that the issues
> I encountered can be fixed before 8.3 ;-)
>
> The biggest issue was a DEADLOCK which occurs quite reliably with a
> given sequence of events in short succession, on a chroot filesystem
> with many snapshots and a MySQL socket and nullfs mounts inside it:
>
>      1. Force unmount the nullfs mounts which are mounted on top of it
>      2. Close the MySQL socket in /tmp
>      3. Force unmount the actual filesystem (even if there are open FDs)
>      4. 'zfs rename' the filesystem into our 'trash' filesystem (which I
>         understand consists of a clone, promote and destroy)
>
> The entire ZFS subsystem then hangs on any new I/O.
>
> Here is a procstat of the zfs rename process which hangs after the force
> unmount:
>
> 25674 100871 zfs              initial thread   mi_switch+0x176
> sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x85
> dsl_sync_task_group_wait+0x128 dsl_sync_task_do+0x54 dsl_dir_rename+0x8f
> dsl_dataset_rename+0x272 zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7b
> kern_ioctl+0x102 ioctl+0xfd syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2
>
> Unfortunately it's not easy to reproduce; it only seems to happen in an
> environment which is under load with a lot of datasets and a lot of zfs
> operations happening concurrently on other datasets.  I spent two days
> trying to reproduce it in self-contained test environments but had no
> luck, so I'm now reporting it anyway.
>
> There were two other issues which came up:
>
>      1. http://www.freebsd.org/cgi/query-pr.cgi?pr=157728 - we worked
>         around this with a semaphore on 'zfs list' and 'zfs recv' so
>         they never ran simultaneously.
>      2. After an incremental receive, v28 seems to like to mount the
>         filesystem even if it was unmounted at the start of the receive.
>         (Notably, on previous versions of ZFS, this only happened for
>         non-incremental receives where the filesystem was being created
>         by the receive -- incremental receives correctly left the
>         filesystem in the mount state it started in). This plays very
>         badly when the filesystem then gets modified before we can force
>         unmount it (which we do immediately), because in this case the
>         next receive operation will fail with "filesystem has
>         modifications" - which we handle, but it's expensive to do so on
>         every incremental receive.
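
The semaphore workaround from item 1 could be sketched roughly as follows,
assuming every caller of 'zfs list' and 'zfs recv' goes through the same
wrapper. This is not the implementation described in the message, which is
not shown. The lock file path and function names are hypothetical, and an
advisory flock() lock stands in for whatever semaphore was actually used.

```python
import fcntl
import subprocess
from contextlib import contextmanager

LOCK_PATH = "/tmp/zfs-list-recv.lock"  # hypothetical lock file location


@contextmanager
def exclusive(lock_path=LOCK_PATH):
    """Hold an exclusive advisory lock on lock_path for the with-block."""
    with open(lock_path, "w") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks while another holder exists
        try:
            yield
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


def run_serialized(argv, lock_path=LOCK_PATH, **kwargs):
    """Run argv while holding the lock and return the CompletedProcess."""
    with exclusive(lock_path):
        return subprocess.run(argv, **kwargs)


# Example use (both calls contend for the same lock, so they never overlap):
#   run_serialized(["zfs", "list", "-H", "-o", "name"])
#   run_serialized(["zfs", "recv", "-u", "backup/site"], stdin=stream)
```

The lock is only advisory: a 'zfs list' or 'zfs recv' started outside the
wrapper is not blocked by it.
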
>
> I had a conversation with jhell on IRC about this and he had this to
> say:
>
> <jhell> its happened twice before with ZFS basically a lock being held
> and never free'd
> <jhell> something there is happening between the snapshots and datasets
> though. seems that it for some reason is able to destroy the dataset
> before it destroys all the snapshots properly
> <jhell> then tries to do the renaming of the snapshots and leads to a
> lock not being free()'d or similar
>
> Maybe this can offer a hint for someone to go looking in the right
> direction to solve this?
>
> Thank you for working on ZFS in FreeBSD!  v15 is working very well for
> us.
>


-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk