From: Martin Matuska
Date: Sat, 16 Jul 2011 00:02:27 +0200
To: Luke Marsden
Cc: freebsd-fs@freebsd.org, tech@hybrid-logic.co.uk
Subject: Re: Experiences with ZFS v28 - including deadlock
Message-ID: <4E20B8F3.5060603@FreeBSD.org>
In-Reply-To: <1310733049.26698.69.camel@behemoth>

Hi Luke,

regarding the incremental receive, does the mount happen even if using
the "-u" option to the zfs receive command?

The manpage for zfs (receive section) says:

     -u   File system that is associated with the received stream is
          not mounted.
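
For anyone who wants to try this, here is a minimal sketch (not taken from the thread) of an incremental send piped into "zfs receive -u", followed by a check of the "mounted" property. The dataset and snapshot names in the usage note are hypothetical, and the helper names are made up for illustration; only the zfs subcommands and flags themselves ("send -i", "receive -u", "get -H -o value mounted") are real.

```python
import subprocess


def send_argv(from_snap, to_snap):
    """Command line for an incremental stream between two snapshots."""
    return ["zfs", "send", "-i", from_snap, to_snap]


def receive_argv(target, unmounted=True):
    """Command line for the receiving side; -u asks zfs not to mount."""
    argv = ["zfs", "receive"]
    if unmounted:
        argv.append("-u")
    argv.append(target)
    return argv


def mounted_argv(dataset):
    """Command line that prints 'yes' or 'no' for the mounted property."""
    return ["zfs", "get", "-H", "-o", "value", "mounted", dataset]


def incremental_receive(from_snap, to_snap, target):
    """Pipe 'zfs send -i' into 'zfs receive -u'; True if both succeed."""
    sender = subprocess.Popen(send_argv(from_snap, to_snap),
                              stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_argv(target), stdin=sender.stdout)
    # Close our copy of the pipe so the sender sees SIGPIPE if the
    # receiver exits early.
    sender.stdout.close()
    recv_rc = receiver.wait()
    send_rc = sender.wait()
    return send_rc == 0 and recv_rc == 0


def is_mounted(dataset):
    """Ask zfs whether the dataset is currently mounted."""
    out = subprocess.run(mounted_argv(dataset), capture_output=True,
                         text=True, check=True)
    return out.stdout.strip() == "yes"
```

With hypothetical names, incremental_receive("tank/fs@a", "tank/fs@b", "backup/fs") followed by is_mounted("backup/fs") would show whether the receive left the filesystem unmounted.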

Cheers,
mm

On 15. 7. 2011 at 14:30, Luke Marsden wrote:
> Hi all,
>
> Having just quite extensively tested the v28 patchset contained within
> http://mfsbsd.vx.sk/iso/mfsbsd-se-8.2-zfsv28-amd64.iso (updated
> 19.06.2011) I wanted to share my experiences in the hope that the issues
> I encountered can be fixed before 8.3 ;-)
>
> The biggest issue was a DEADLOCK which occurs quite reliably with a
> given sequence of events in short succession, on a chroot filesystem
> with many snapshots and a MySQL socket and nullfs mounts inside it:
>
>      1. Force unmount the nullfs mounts which are mounted on top of it
>      2. Close the MySQL socket in /tmp
>      3. Force unmount the actual filesystem (even if there are open FDs)
>      4. 'zfs rename' the filesystem into our 'trash' filesystem (which I
>         understand consists of a clone, promote and destroy)
>
> The entire ZFS subsystem then hangs on any new I/O.
>
> Here is a procstat of the zfs rename process which hangs after the force
> unmount:
>
> 25674 100871 zfs              initial thread   mi_switch+0x176
> sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x85
> dsl_sync_task_group_wait+0x128 dsl_sync_task_do+0x54 dsl_dir_rename+0x8f
> dsl_dataset_rename+0x272 zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7b
> kern_ioctl+0x102 ioctl+0xfd syscallenter+0x1e5 syscall+0x4b
> Xfast_syscall+0xe2
>
> Unfortunately it's not easy to reproduce: it only seems to happen in an
> environment which is under load with a lot of datasets and a lot of zfs
> operations happening concurrently on other datasets. I spent two days
> trying to reproduce it in self-contained test environments but had no
> luck, so I'm now reporting it anyway.
>
> There were two other issues which came up:
>
>      1. http://www.freebsd.org/cgi/query-pr.cgi?pr=157728 - we worked
>         around this with a semaphore on 'zfs list' and 'zfs recv' so
>         they never ran simultaneously.
>      2. After an incremental receive, v28 seems to like to mount the
>         filesystem even if it was unmounted at the start of the receive.
>         (Notably, on previous versions of ZFS, this only happened for
>         non-incremental receives where the filesystem was being created
>         by the receive -- incremental receives correctly left the
>         filesystem in the mount state it started in). This plays very
>         badly when the filesystem then gets modified before we can force
>         unmount it (which we do immediately), because in this case the
>         next receive operation will fail with "filesystem has
>         modifications" - which we handle, but it's expensive to do so on
>         every incremental receive.
>
> I had a conversation with jhell on IRC about this and he had this to
> say:
>
>         its happened twice before with ZFS basically a lock being held
>         and never free'd
>
>         something there is happening between the snapshots and datasets
>         though. seems that it for some reason is able to destroy the
>         dataset before it destroys all the snapshots properly
>
>         then tries to do the renaming of the snapshots and leads to a
>         lock not being free()'d or similar
>
> Maybe this can offer a hint for someone to go looking in the right
> direction to solve this?
>
> Thank you for working on ZFS in FreeBSD! v15 is working very well for
> us.
>

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
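
The quoted message mentions working around PR 157728 with a semaphore so that 'zfs list' and 'zfs recv' never run at the same time. The thread does not say how that semaphore was implemented; the sketch below is one possible approach, assuming an advisory flock(2) lock on a shared file. The lock path and the helper names are hypothetical.

```python
import fcntl
import os
import subprocess
import tempfile
from contextlib import contextmanager

# Hypothetical lock file shared by every process that runs zfs list/recv.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "zfs-list-recv.lock")


@contextmanager
def zfs_exclusive(lock_path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        # Blocks until no other holder has the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def run_serialized(argv, lock_path=LOCK_PATH, **kwargs):
    """Run one command while holding the lock, so that callers using the
    same lock file never overlap."""
    with zfs_exclusive(lock_path):
        return subprocess.run(argv, **kwargs)
```

Both call sites would go through the same helper, for example run_serialized(["zfs", "list", "-H", "-o", "name"]) on one side and run_serialized(["zfs", "receive", "-u", target], stdin=stream) on the other, where target and stream are placeholders for the caller's dataset name and incoming send stream. The lock is advisory, so it only serializes commands that are started through this wrapper.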