From: Andreas Longwitz
Date: Mon, 12 Aug 2013 12:28:17 +0200
To: bug-followup@freebsd.org, ahktenzero+freebsd@mohorovi.cc, freebsd-bugs@freebsd.org
Subject: Re: kern/180060: [zfs] [panic] ZFS kernel panic, solaris assert on dsl_prop_unregister
Message-ID: <5208B8C1.50905@incore.de>

In the meantime I did some more analysis of the problem and can explain why the panic happens.

Two threads are involved in the problem: one runs the zfs command and wants to do a rollback using an ioctl() on /dev/zfs; the other runs mountd and tries to do an "unmount exports" with "export.ex_flags = MNT_DELEXPORT". Both threads are working on the same dataset, ds=0xffffff0126912c00.

The rollback thread wants to unregister its aclinherit property and panics, because this property is no longer in the list of registered properties:

  (kgdb) f 12
  #12 0xffffffff80cbd0b8 in zfs_unregister_callbacks (zfsvfs=0xffffff01f5664000)
      at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1278
  1278            VERIFY(dsl_prop_unregister(ds, "aclinherit",
  (kgdb) l
  1273                zfsvfs) == 0);
  1274
  1275            VERIFY(dsl_prop_unregister(ds, "aclmode", acl_mode_changed_cb,
  1276                zfsvfs) == 0);
  1277
  1278            VERIFY(dsl_prop_unregister(ds, "aclinherit",
  1279                acl_inherit_changed_cb, zfsvfs) == 0);
  1280
  1281            VERIFY(dsl_prop_unregister(ds, "vscan",
  1282                vscan_changed_cb, zfsvfs) == 0);

The mountd thread wants to register some properties, but it first unregisters everything (see the comment in the source):

  (kgdb) f 14
  #14 0xffffffff80cc0048 in zfs_mount (vfsp=0xffffff03e49398d0)
      at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1706
  1706                    error = zfs_register_callbacks(vfsp);
  (kgdb) list
  1700             * When doing a remount, we simply refresh our temporary properties
  1701             * according to those options set in the current VFS options.
  1702             */
  1703            if (vfsp->vfs_flag & MS_REMOUNT) {
  1704                    /* refresh mount options */
  1705                    zfs_unregister_callbacks(vfsp->vfs_data);
  1706                    error = zfs_register_callbacks(vfsp);
  1707                    goto out;
  1708            }
  1709

There is a small time gap between lines 1705 and 1706 during which the aclinherit property is not registered. If another thread tries to unregister this property during this gap, its VERIFY fails and the kernel panics.

I don't know how to fix this properly.
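
To make the gap concrete, here is a small user-space model of the interleaving. It is only a sketch and not the ZFS code: prop_registry, reg_register(), reg_unregister() and replay() are names invented for this illustration, and the return value -1 merely stands in for whatever nonzero error dsl_prop_unregister() reports when the callback is missing. The two threads are replayed sequentially so that the ordering is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-dataset list of registered properties. */
#define MAX_PROPS 8

struct prop_registry {
	const char *names[MAX_PROPS];
	size_t count;
};

/* Add a property; returns 0 on success, -1 if the table is full. */
static int
reg_register(struct prop_registry *r, const char *name)
{
	assert(r != NULL && name != NULL);
	if (r->count == MAX_PROPS)
		return (-1);
	r->names[r->count++] = name;
	return (0);
}

/* Remove a property; returns 0 if it was registered, -1 if it is missing. */
static int
reg_unregister(struct prop_registry *r, const char *name)
{
	assert(r != NULL && name != NULL);
	for (size_t i = 0; i < r->count; i++) {
		if (strcmp(r->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			r->names[i] = r->names[--r->count];
			return (0);
		}
	}
	return (-1);
}

/*
 * Replay the remount (unregister, then register) together with one
 * unregister from the rollback thread.  With hit_gap set, the rollback
 * runs between the two remount steps.  Returns what the rollback sees.
 */
static int
replay(bool hit_gap)
{
	struct prop_registry r = { .count = 0 };
	int rollback_result;

	(void)reg_register(&r, "aclinherit");	/* state after the first mount */

	(void)reg_unregister(&r, "aclinherit");	/* remount, like line 1705 */
	if (hit_gap) {
		/* rollback runs inside the gap: the entry is gone */
		rollback_result = reg_unregister(&r, "aclinherit");
		(void)reg_register(&r, "aclinherit");	/* like line 1706 */
	} else {
		(void)reg_register(&r, "aclinherit");	/* like line 1706 */
		/* rollback runs after the gap: the entry is back */
		rollback_result = reg_unregister(&r, "aclinherit");
	}
	return (rollback_result);
}
```

In this model replay(true) returns -1, which is the case where a VERIFY(... == 0) around the call would fire, while replay(false) returns 0. The model says nothing about which lock in the real code would have to cover both steps of the remount.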

Simply removing the VERIFY on dsl_prop_unregister() would be easy, but I hope that one of the ZFS gurus will have a look at this and that we will get a better solution.

-- 
Andreas Longwitz