Date: Sat, 16 Jul 2011 00:02:27 +0200
From: Martin Matuska <mm@FreeBSD.org>
To: Luke Marsden <luke-lists@hybrid-logic.co.uk>
Cc: freebsd-fs@freebsd.org, tech@hybrid-logic.co.uk
Subject: Re: Experiences with ZFS v28 - including deadlock
Message-ID: <4E20B8F3.5060603@FreeBSD.org>
In-Reply-To: <1310733049.26698.69.camel@behemoth>
References: <1310733049.26698.69.camel@behemoth>

Hi Luke,

regarding the incremental receive, does the mount happen even if using
the "-u" option to the zfs receive command?

The manpage for zfs (receive section) says:

    -u   File system that is associated with the received stream is
         not mounted.

Cheers,
mm

On 15. 7. 2011 14:30, Luke Marsden wrote:
> Hi all,
>
> Having just quite extensively tested the v28 patchset contained within
> http://mfsbsd.vx.sk/iso/mfsbsd-se-8.2-zfsv28-amd64.iso (updated
> 19.06.2011) I wanted to share my experiences in the hope that the issues
> I encountered can be fixed before 8.3 ;-)
>
> The biggest issue was a DEADLOCK which occurs quite reliably with a
> given sequence of events in short succession, on a chroot filesystem
> with many snapshots and a MySQL socket and nullfs mounts inside it:
>
>      1. Force unmount the nullfs mounts which are mounted on top of it
>      2. Close the MySQL socket in /tmp
>      3. Force unmount the actual filesystem (even if there are open FDs)
>      4. 'zfs rename' the filesystem into our 'trash' filesystem (which I
>         understand consists of a clone, promote and destroy)
>
> The entire ZFS subsystem then hangs on any new I/O.
>
> Here is a procstat of the zfs rename process which hangs after the force
> unmount:
>
> 25674 100871 zfs initial thread mi_switch+0x176
> sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x85
> dsl_sync_task_group_wait+0x128 dsl_sync_task_do+0x54 dsl_dir_rename+0x8f
> dsl_dataset_rename+0x272 zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7b
> kern_ioctl+0x102 ioctl+0xfd syscallenter+0x1e5 syscall+0x4b
> Xfast_syscall+0xe2
>
> Unfortunately it's not easy to reproduce, it only seems to happen in an
> environment which is under load with a lot of datasets and a lot of zfs
> operations happening concurrently on other datasets. I spent two days
> trying to reproduce it in self-contained test environments but had no
> luck, so I'm now reporting it anyway.
>
> There were two other issues which came up:
>
>      1. http://www.freebsd.org/cgi/query-pr.cgi?pr=157728 - we worked
>         around this with a semaphore on 'zfs list' and 'zfs recv' so
>         they never ran simultaneously.
>      2. After an incremental receive, v28 seems to like to mount the
>         filesystem even if it was unmounted at the start of the receive.
>         (Notably, on previous versions of ZFS, this only happened for
>         non-incremental receives where the filesystem was being created
>         by the receive -- incremental receives correctly left the
>         filesystem in the mount state it started in). This plays very
>         badly when the filesystem then gets modified before we can force
>         unmount it (which we do immediately), because in this case the
>         next receive operation will fail with "filesystem has
>         modifications" - which we handle, but it's expensive to do so on
>         every incremental receive.
>
> I had a conversation with jhell on IRC about this and he had this to
> say:
>
> <jhell> its happened twice before with ZFS basically a lock being held
> and never free'd
> <jhell> something there is happening between the snapshots and datasets
> though. seems that it for some reason is able to destroy the dataset
> before it destroys all the snapshots properly
> <jhell> then tries to do the renaming of the snapshots and leads to a
> lock not being free()'d or similar
>
> Maybe this can offer a hint for someone to go looking in the right
> direction to solve this?
>
> Thank you for working on ZFS in FreeBSD! v15 is working very well for
> us.
>

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
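
For readers who want to try the two workarounds discussed in this thread, here is a minimal sketch, not the code Luke's team actually runs. It serialises 'zfs list' and 'zfs recv' behind one exclusive lock and passes "-u" to the receive. It uses an advisory file lock (flock) where the original mail says "semaphore"; the lock path, the function names and the dataset name "tank/fs" are all hypothetical. Whether "-u" avoids the unwanted mount on v28 is exactly the open question above, so treat that part as something to test.

```python
import fcntl
import os
import subprocess
import tempfile
from contextlib import contextmanager

# Hypothetical lock file; any path that every caller agrees on would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "zfs-list-recv.lock")


@contextmanager
def zfs_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while another holder has it
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def recv_command(dataset, unmounted=True):
    """Build the argv for 'zfs receive'; -u asks for no mount afterwards."""
    cmd = ["zfs", "receive"]
    if unmounted:
        cmd.append("-u")
    cmd.append(dataset)
    return cmd


def locked_run(cmd, stdin=None, lock_path=LOCK_PATH, runner=subprocess.run):
    """Run a zfs command while holding the shared lock."""
    with zfs_lock(lock_path):
        return runner(cmd, stdin=stdin, check=True)
```

A receive would then look like locked_run(recv_command("tank/fs"), stdin=stream), and a listing like locked_run(["zfs", "list"]); because both go through the same lock they can never overlap.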