From owner-freebsd-fs@FreeBSD.ORG Wed Jun 19 14:40:01 2013
Date: Wed, 19 Jun 2013 14:40:01 GMT
From: "Steven Hartland"
To: freebsd-fs@FreeBSD.org
Subject: Re: kern/161968: [zfs] [hang] renaming snapshot with -r including a zvol snapshot causes total ZFS freeze/lockup
List-Id: Filesystems

The following reply was made to PR kern/161968; it has been noted by GNATS.

From: "Steven Hartland"
To: , "Peter Maloney"
Subject: Re: kern/161968: [zfs] [hang] renaming snapshot with -r including a zvol snapshot causes total ZFS freeze/lockup
Date: Wed, 19 Jun 2013 15:23:13 +0100

I've reproduced this here. The cause is a deadlock between the zvol GEOM access path and the ZFS sync task, each side blocked on one of two locks:

db> show sleepchain
thread 100553 (pid 6, txg_thread_enter) blocked on sx "spa_namespace_lock" XLOCK
thread 100054 (pid 2, g_event) blocked on sx "dp->dp_config_rwlock" XLOCK
db>
Tracing pid 2 tid 100054 td 0xffffff001c1d4470
sched_switch() at sched_switch+0x153
mi_switch() at mi_switch+0x1f8
sleepq_switch() at sleepq_switch+0x123
sleepq_wait() at sleepq_wait+0x4d
_sx_slock_hard() at _sx_slock_hard+0x1e2
_sx_slock() at _sx_slock+0xc9
dsl_dir_open_spa() at dsl_dir_open_spa+0xab
dsl_dataset_hold() at dsl_dataset_hold+0x3b
dsl_dataset_own() at dsl_dataset_own+0x2f
dmu_objset_own() at dmu_objset_own+0x36
zvol_first_open() at zvol_first_open+0x34
zvol_geom_access() at zvol_geom_access+0x2df
g_access() at g_access+0x1ba
g_part_taste() at g_part_taste+0xc4
g_new_provider_event() at g_new_provider_event+0xaa
g_run_events() at g_run_events+0x250
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff92070a2bb0, rbp = 0 ---

db> bt 100553
Tracing pid 6 tid 100553 td 0xffffff002c2308e0
sched_switch() at sched_switch+0x153
mi_switch() at mi_switch+0x1f8
sleepq_switch() at sleepq_switch+0x123
sleepq_wait() at sleepq_wait+0x4d
_sx_xlock_hard() at _sx_xlock_hard+0x296
_sx_xlock() at _sx_xlock+0xb7
zvol_rename_minors() at zvol_rename_minors+0x75
dsl_dataset_snapshot_rename_sync() at dsl_dataset_snapshot_rename_sync+0x141
dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x14e
dsl_pool_sync() at dsl_pool_sync+0x47d
spa_sync() at spa_sync+0x34a
txg_sync_thread() at txg_sync_thread+0x139
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff920e61abb0, rbp = 0 ---
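For anyone who wants the shape of the problem without a kernel debugger: below is a minimal userland sketch of the AB-BA lock-order inversion the traces show, with plain pthread mutexes standing in for the two sx(9) locks. The names are borrowed from the traces purely for illustration; this is not the kernel code itself, just a sketch of the ordering. Build with "cc -pthread" and both threads block forever, mirroring the hang.

/*
 * Illustrative sketch only: pthread mutexes standing in for the kernel
 * sx(9) locks spa_namespace_lock and dp_config_rwlock.  Thread names
 * follow the two traces above; the real call paths are in the kernel.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t spa_namespace_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dp_config_rwlock   = PTHREAD_MUTEX_INITIALIZER;

static void *
txg_sync(void *arg)
{
	(void)arg;
	/* The sync task path already holds the pool config lock... */
	pthread_mutex_lock(&dp_config_rwlock);
	printf("txg_sync: holds dp_config_rwlock, wants spa_namespace_lock\n");
	sleep(1);		/* widen the window so the inversion always hits */
	/* ...and then (zvol_rename_minors) needs the namespace lock. */
	pthread_mutex_lock(&spa_namespace_lock);
	printf("txg_sync: never reached\n");
	return (NULL);
}

static void *
g_event(void *arg)
{
	(void)arg;
	/* The GEOM access path takes the namespace lock first... */
	pthread_mutex_lock(&spa_namespace_lock);
	printf("g_event: holds spa_namespace_lock, wants dp_config_rwlock\n");
	sleep(1);
	/* ...and then (dsl_dir_open_spa) needs the config lock. */
	pthread_mutex_lock(&dp_config_rwlock);
	printf("g_event: never reached\n");
	return (NULL);
}

int
main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, txg_sync, NULL);
	pthread_create(&t2, NULL, g_event, NULL);
	pthread_join(t1, NULL);	/* never returns: the two threads are deadlocked */
	pthread_join(t2, NULL);
	return (0);
}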
The following steps recreate the issue on stable/8 r251496:

gpart create -s GPT da3
gpart add -t freebsd-zfs da3
zpool create -f testpool da3p1
zfs create -V 150m testpool/testvol
zfs snapshot -r testpool@snap
zfs rename -r testpool@snap testpool@snap-new

I've been unable to reproduce this on current r251471. I'm not sure if this is down to a timing difference caused by the significant changes to ZFS sync tasks in current, or if the issue really no longer exists.

Regards
Steve