From owner-freebsd-fs@FreeBSD.ORG Wed Oct 8 21:20:04 2008 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48B25106568F for ; Wed, 8 Oct 2008 21:20:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 2A47F8FC14 for ; Wed, 8 Oct 2008 21:20:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m98LK3S5090570 for ; Wed, 8 Oct 2008 21:20:03 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m98LK3fx090569; Wed, 8 Oct 2008 21:20:03 GMT (envelope-from gnats) Date: Wed, 8 Oct 2008 21:20:03 GMT Message-Id: <200810082120.m98LK3fx090569@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "Weldon Godfrey" Cc: Subject: RE: kern/125149: [zfs][nfs] changing into .zfs dir from nfs clientcauses endless panic loop X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Weldon Godfrey List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Oct 2008 21:20:04 -0000 The following reply was made to PR kern/125149; it has been noted by GNATS. From: "Weldon Godfrey" To: "Jaakko Heinonen" , "Volker Werth" Cc: Subject: RE: kern/125149: [zfs][nfs] changing into .zfs dir from nfs clientcauses endless panic loop Date: Wed, 8 Oct 2008 16:06:50 -0500 Thanks! I will apply these patches tomorrow. Weldon -----Original Message----- From: Jaakko Heinonen [mailto:jh@saunalahti.fi]=20 Sent: Tuesday, October 07, 2008 10:37 AM To: Volker Werth Cc: Weldon Godfrey; bug-followup@freebsd.org Subject: Re: kern/125149: [zfs][nfs] changing into .zfs dir from nfs clientcauses endless panic loop Hi, On 2008-10-02, Volker Werth wrote: > > #8 0xffffffff804f06fa in vput (vp=3D0x0) at atomic.h:142 > > #9 0xffffffff8060670d in nfsrv_readdirplus (nfsd=3D0xffffff000584f100, > > slp=3D0xffffff0005725900,=20 > > td=3D0xffffff00059a0340, mrq=3D0xffffffffdf761af0) at > > /usr/src/sys/nfsserver/nfs_serv.c:3613 > > #10 0xffffffff80615a5d in nfssvc (td=3DVariable "td" is not = available. > > ) at /usr/src/sys/nfsserver/nfs_syscalls.c:461 > > #11 0xffffffff8072f377 in syscall (frame=3D0xffffffffdf761c70) at > > /usr/src/sys/amd64/amd64/trap.c:852 > > #12 0xffffffff807158bb in Xfast_syscall () at > > /usr/src/sys/amd64/amd64/exception.S:290 > > #13 0x000000080068746c in ?? () > > Previous frame inner to this frame (corrupt stack?) >=20 > I think the problem is the NULL pointer to vput. A maintainer needs to > check how nvp can get a NULL pointer (judging by assuming my fresh > codebase is not too different from yours). The bug is reproducible with nfs clients using readdirplus. FreeBSD client doesn't use readdirplus by default but you can enable it with -l mount option. Here are steps to reproduce the panic with FreeBSD nfs client: - nfs export a zfs file system - on client mount the file system with -l mount option and list the zfs control directory # mount_nfs -l x.x.x.x:/tank /mnt # ls /mnt/.zfs I see two bugs here: 1) nfsrv_readdirplus() doesn't check VFS_VGET() error status properly. It only checks for EOPNOTSUPP but other errors are ignored. This is the final reason for the panic and in theory it could happen for other file systems too. In this case VFS_VGET() returns EINVAL and results NULL nvp. 2) zfs VFS_VGET() returns EINVAL for .zfs control directory entries. Looking at zfs_vget() it tries find corresponding znode to fulfill the request. However control directory entries don't have backing znodes. Here is a patch which fixes 1). The patch prevents system from panicing but a fix for 2) is needed to make readdirplus work with .zfs directory. %%% Index: sys/nfsserver/nfs_serv.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- sys/nfsserver/nfs_serv.c (revision 183511) +++ sys/nfsserver/nfs_serv.c (working copy) @@ -3597,9 +3597,12 @@ again: * Probe one of the directory entries to see if the filesystem * supports VGET. */ - if (VFS_VGET(vp->v_mount, dp->d_fileno, LK_EXCLUSIVE, &nvp) =3D=3D - EOPNOTSUPP) { - error =3D NFSERR_NOTSUPP; + error =3D VFS_VGET(vp->v_mount, dp->d_fileno, LK_EXCLUSIVE, &nvp); + if (error) { + if (error =3D=3D EOPNOTSUPP) + error =3D NFSERR_NOTSUPP; + else + error =3D NFSERR_SERVERFAULT; vrele(vp); vp =3D NULL; free((caddr_t)cookies, M_TEMP); %%% And here's an attempt to add support for .zfs control directory entries (bug 2)) in zfs_vget(). The patch is very experimental and it only works for snapshots which are already active (mounted). %%% Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c (revision 183587) +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c (working copy) @@ -759,9 +759,10 @@ zfs_vget(vfs_t *vfsp, ino_t ino, int fla VN_RELE(ZTOV(zp)); err =3D EINVAL; } - if (err !=3D 0) - *vpp =3D NULL; - else { + if (err !=3D 0) { + /* try .zfs control directory */ + err =3D zfsctl_vget(vfsp, ino, flags, vpp); + } else { *vpp =3D ZTOV(zp); vn_lock(*vpp, flags); } Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c (revision 183587) +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c (working copy) @@ -1047,6 +1047,63 @@ zfsctl_lookup_objset(vfs_t *vfsp, uint64 return (error); } =20 +int +zfsctl_vget(vfs_t *vfsp, uint64_t nodeid, int flags, vnode_t **vpp) +{ + zfsvfs_t *zfsvfs =3D vfsp->vfs_data; + vnode_t *dvp, *vp; + zfsctl_snapdir_t *sdp; + zfsctl_node_t *zcp; + zfs_snapentry_t *sep; + int error; + + *vpp =3D NULL; + + ASSERT(zfsvfs->z_ctldir !=3D NULL); + error =3D zfsctl_root_lookup(zfsvfs->z_ctldir, "snapshot", &dvp, + NULL, 0, NULL, kcred); + if (error !=3D 0) + return (error); + + if (nodeid =3D=3D ZFSCTL_INO_ROOT || nodeid =3D=3D ZFSCTL_INO_SNAPDIR) = { + if (nodeid =3D=3D ZFSCTL_INO_SNAPDIR) + *vpp =3D dvp; + else { + VN_RELE(dvp); + *vpp =3D zfsvfs->z_ctldir; + VN_HOLD(*vpp); + } + /* XXX: LK_RETRY? */ + vn_lock(*vpp, flags | LK_RETRY); + return (0); + } + =09 + sdp =3D dvp->v_data; + + mutex_enter(&sdp->sd_lock); + sep =3D avl_first(&sdp->sd_snaps); + while (sep !=3D NULL) { + vp =3D sep->se_root; + zcp =3D vp->v_data; + if (zcp->zc_id =3D=3D nodeid) + break; + + sep =3D AVL_NEXT(&sdp->sd_snaps, sep); + } + + if (sep !=3D NULL) { + VN_HOLD(vp); + *vpp =3D vp; + vn_lock(*vpp, flags); + } else + error =3D EINVAL; + + mutex_exit(&sdp->sd_lock); + + VN_RELE(dvp); + + return (error); +} /* * Unmount any snapshots for the given filesystem. This is called from * zfs_umount() - if we have a ctldir, then go through and unmount all the Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ctldir.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ctldir.h (revision 183587) +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ctldir.h (working copy) @@ -60,6 +60,7 @@ int zfsctl_root_lookup(vnode_t *dvp, cha int flags, vnode_t *rdir, cred_t *cr); =20 int zfsctl_lookup_objset(vfs_t *vfsp, uint64_t objsetid, zfsvfs_t **zfsvfsp); +int zfsctl_vget(vfs_t *vfsp, uint64_t nodeid, int flags, vnode_t **vpp); =20 #define ZFSCTL_INO_ROOT 0x1 #define ZFSCTL_INO_SNAPDIR 0x2 %%% --=20 Jaakko