Date: Tue, 15 Sep 2009 11:13:40 +0000 (UTC)
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r197215 - in stable/8: . cddl/compat/opensolaris cddl/compat/opensolaris/include cddl/contrib/opensolaris cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/head cddl/contrib/ope...
Message-ID: <200909151113.n8FBDeZ1086175@svn.freebsd.org>
Author: pjd
Date: Tue Sep 15 11:13:40 2009
New Revision: 197215
URL: http://svn.freebsd.org/changeset/base/197215

Log:
  MFC r196456,r196457,r196458,r196662,r196702,r196703,r196919,r196927,r196928,
  r196943,r196944,r196947,r196950,r196953,r196954,r196965,r196978,r196979,
  r196980,r196982,r196985,r196992,r197131,r197133,r197150,r197151,r197152,
  r197153,r197167,r197172,r197177,r197200,r197201:

  r196456:

  - Give minclsyspri and maxclsyspri real values (consulted with kmacy).
  - Honour 'pri' argument for thread_create().

  r196457:

  Set priority of vdev_geom threads and zvol threads to PRIBIO.

  r196458:

  - Hide ZFS kernel threads under zfskern process.
  - Use better (shorter) threads names:
    'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00'
    'vdev:worker da0' -> 'vdev da0'

  r196662:

  Add missing mountpoint vnode locking. This fixes panic on assertion with
  DEBUG_VFS_LOCKS and vfs.usermount=1 when regular user tries to mount dataset
  owned by him.

  r196702:

  Remove empty directory.

  r196703:

  Backport the 'dirtying dbuf' panic fix from newer ZFS version.

  Reported by: Thomas Backman <serenity@exscape.org>

  r196919:

  bzero() on-stack argument, so mutex_init() won't misinterpret that the
  lock is already initialized if we have some garbage on the stack.

  PR: kern/135480
  Reported by: Emil Mikulic <emikulic@gmail.com>

  r196927:

  Changing provider size is not really supported by GEOM, but doing so when
  provider is closed should be ok.
  When administrator requests to change ZVOL size do it immediately if ZVOL
  is closed or do it on last ZVOL close.

  PR: kern/136942
  Requested by: Bernard Buri <bsd@ask-us.at>

  r196928:

  Teach zdb(8) how to obtain GEOM provider size.

  PR: kern/133134
  Reported by: Philipp Wuensche <cryx-freebsd@h3q.com>

  r196943:

  - Avoid holding mutex around M_WAITOK allocations.
  - Add locking for mnt_opt field.

  r196944:

  Don't recheck ownership on update mount. This will eliminate LOR between
  vfs_busy() and mount mutex. We check ownership in vfs_domount() anyway.

  Noticed by: kib
  Reviewed by: kib

  r196947:

  Defer thread start until we set priority.

  Reviewed by: kib

  r196950:

  Fix detection of file system being shared. Now zfs unshare/destroy/rename
  command will properly remove exported file systems.

  r196953:

  When snapshot mount point is busy (for example we are still in it)
  we will fail to unmount it, but it won't be removed from the tree,
  so in that case there is no need to reinsert it.

  Reported by: trasz

  r196954:

  If we have to use avl_find(), optimize a bit and use avl_insert() instead of
  avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()).

  Fix similar case in the code that is currently commented out.

  r196965:

  Fix reference count leak for a case where snapshot's mount point is updated.

  r196978:

  Call ZFS_EXIT() after locking the vnode.

  r196979:

  On FreeBSD we don't have to look for snapshot's mount point,
  because fhtovp method is already called with proper mount point.

  r196980:

  When we automatically mount snapshot we want to return vnode of the mount point
  from the lookup and not covered vnode. This is one of the fixes for using .zfs/
  over NFS.

  r196982:

  We don't export individual snapshots, so mnt_export field in snapshot's
  mount point is NULL. That's why when we try to access snapshots over NFS
  use mnt_export field from the parent file system.

  r196985:

  Only log successful commands! Without this fix we log even unsuccessful
  commands executed by unprivileged users. Action is not really taken,
  but it is logged to pool history, which might be confusing.

  Reported by: Denis Ahrens <denis@h3q.com>

  r196992:

  Implement __assert() for Solaris-specific code. Until now Solaris code was
  using Solaris prototype for __assert(), but FreeBSD's implementation.
  Both take different arguments, so we were either core-dumping in assert()
  or printing garbage.

  Reported by: avg

  r197131:

  Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain
  NULL, but also can point to dead vnode, take that into account.

  PR: kern/132068
  Reported by: Edward Fisk <7ogcg7g02@sneakemail.com>, kris
  Fix based on patch from: Jaakko Heinonen <jh@saunalahti.fi>

  r197133:

  - Protect reclaim with z_teardown_inactive_lock.
  - Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if
    z_dbuf field is NULL - this might happen in case of rollback or forced
    unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete().
  - On forced unmount wait for all znodes to be destroyed - destruction can be
    done asynchronously via zfs_reclaim_complete().

  r197150:

  There is a bug where mze_insert() can trigger an assert() of inserting the same
  entry twice. This bug is not fixed yet, but it leads to a situation where the
  kernel will panic when trying to access a corrupted directory. Until the bug is
  properly fixed, try to recover from it and log that it happened.

  Reported by: marck
  OpenSolaris bug: 6709336

  r197151:

  Be sure not to overflow struct fid.

  r197152:

  Extend scope of the z_teardown_lock lock for consistency and "just in case".

  r197153:

  When zfs.ko is compiled with debug, make sure that znode and vnode point at
  each other.

  r197167:

  Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories
  by just returning EOPNOTSUPP. This will allow NFS server to fall back to
  regular READDIR.

  Note that converting inode number to snapshot's vnode is expensive operation.
  Snapshots are stored in AVL tree, but based on their names, not inode numbers,
  so to convert inode to snapshot vnode we have to iterate over all snapshots.

  This is not a problem in OpenSolaris, because in their READDIRPLUS
  implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on
  d_fileno as we do.

  PR: kern/125149
  Reported by: Weldon Godfrey <wgodfrey@ena.com>
  Analysis by: Jaakko Heinonen <jh@saunalahti.fi>

  r197172:

  Add missing \n.

  Reported by: marck

  r197177:

  Support both cases: when snapshot is already mounted and when it is not yet
  mounted.

  r197200:

  Modify mount(8) to skip MNT_IGNORE file systems by default, just like df(1)
  does. This is not POLA violation, because there is no single file system in the
  base that uses MNT_IGNORE currently, although ZFS snapshots will be mounted with
  MNT_IGNORE after next commit.

  Reviewed by: kib

  r197201:

  - Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular
    df(1) and mount(8) output. This is a bit similar to OpenSolaris and follows
    ZFS route of not listing snapshots by default with 'zfs list' command.
  - Add UPDATING entry to note that ZFS snapshots are no longer visible in
    mount(8) and df(1) output by default.

  Reviewed by: kib

  Approved by: re (bz)

Added:
  stable/8/cddl/compat/opensolaris/include/assert.h
     - copied unchanged from r196992, head/cddl/compat/opensolaris/include/assert.h
Deleted:
  stable/8/cddl/contrib/opensolaris/head/assert.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/rpc/
Modified:
  stable/8/UPDATING
  stable/8/cddl/compat/opensolaris/ (props changed)
  stable/8/cddl/contrib/opensolaris/ (props changed)
  stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c
  stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c
  stable/8/sbin/mount/ (props changed)
  stable/8/sbin/mount/mount.8
  stable/8/sbin/mount/mount.c
  stable/8/sys/ (props changed)
  stable/8/sys/amd64/include/xen/ (props changed)
  stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c
  stable/8/sys/cddl/compat/opensolaris/sys/mutex.h
  stable/8/sys/cddl/compat/opensolaris/sys/proc.h
  stable/8/sys/cddl/compat/opensolaris/sys/vfs.h
  stable/8/sys/cddl/contrib/opensolaris/ (props changed)
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/callb.h stable/8/sys/contrib/dev/acpica/ (props changed) stable/8/sys/contrib/pf/ (props changed) stable/8/sys/dev/xen/xenpci/ (props changed) Modified: stable/8/UPDATING ============================================================================== --- stable/8/UPDATING Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/UPDATING Tue Sep 15 11:13:40 2009 (r197215) @@ -22,6 +22,10 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 8. to maximize performance. (To disable malloc debugging, run ln -s aj /etc/malloc.conf.) +20090915: + ZFS snapshots are now mounted with MNT_IGNORE flag. Use -v option for + mount(8) and -a option for df(1) to see them. + 20090813: Remove the option STOP_NMI. The default action is now to use NMI only for KDB via the newly introduced function stop_cpus_hard() Copied: stable/8/cddl/compat/opensolaris/include/assert.h (from r196992, head/cddl/compat/opensolaris/include/assert.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/8/cddl/compat/opensolaris/include/assert.h Tue Sep 15 11:13:40 2009 (r197215, copy of r196992, head/cddl/compat/opensolaris/include/assert.h) @@ -0,0 +1,55 @@ +/*- + * Copyright (c) 2009 Pawel Jakub Dawidek <pjd@FreeBSD.org> + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#undef assert +#undef _assert + +#ifdef NDEBUG +#define assert(e) ((void)0) +#define _assert(e) ((void)0) +#else +#define _assert(e) assert(e) + +#define assert(e) ((e) ? 
(void)0 : __assert(#e, __FILE__, __LINE__)) +#endif /* NDEBUG */ + +#ifndef _ASSERT_H_ +#define _ASSERT_H_ +#include <stdio.h> +#include <stdlib.h> + +static __inline void +__assert(const char *expr, const char *file, int line) +{ + + (void)fprintf(stderr, "Assertion failed: (%s), file %s, line %d.\n", + expr, file, line); + abort(); + /* NOTREACHED */ +} +#endif /* !_ASSERT_H_ */ Modified: stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Tue Sep 15 11:13:40 2009 (r197215) @@ -1322,6 +1322,14 @@ dump_label(const char *dev) exit(1); } + if (S_ISCHR(statbuf.st_mode)) { + if (ioctl(fd, DIOCGMEDIASIZE, &statbuf.st_size) == -1) { + (void) printf("failed to get size of '%s': %s\n", dev, + strerror(errno)); + exit(1); + } + } + psize = statbuf.st_size; psize = P2ALIGN(psize, (uint64_t)sizeof (vdev_label_t)); Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c Tue Sep 15 11:13:40 2009 (r197215) @@ -172,6 +172,7 @@ is_shared(libzfs_handle_t *hdl, const ch *tab = '\0'; if (strcmp(buf, mountpoint) == 0) { +#if defined(sun) /* * the protocol field is the third field * skip over second field @@ -194,6 +195,10 @@ is_shared(libzfs_handle_t *hdl, const ch return (0); } } +#else + if (proto == PROTO_NFS) + return (SHARED_NFS); +#endif } } Modified: stable/8/sbin/mount/mount.8 ============================================================================== --- stable/8/sbin/mount/mount.8 Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sbin/mount/mount.8 Tue Sep 15 11:13:40 2009 (r197215) @@ 
-469,6 +469,12 @@ or option. .It Fl v Verbose mode. +If the +.Fl v +is used alone, show all file systems, including those that were mounted with the +.Dv MNT_IGNORE +flag and show additional information about each file system (including fsid +when run by root). .It Fl w The file system object is to be read and write. .El Modified: stable/8/sbin/mount/mount.c ============================================================================== --- stable/8/sbin/mount/mount.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sbin/mount/mount.c Tue Sep 15 11:13:40 2009 (r197215) @@ -348,6 +348,9 @@ main(int argc, char *argv[]) if (checkvfsname(mntbuf[i].f_fstypename, vfslist)) continue; + if (!verbose && + (mntbuf[i].f_flags & MNT_IGNORE) != 0) + continue; prmount(&mntbuf[i]); } } Modified: stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c Tue Sep 15 11:13:40 2009 (r197215) @@ -45,20 +45,33 @@ vfs_setmntopt(vfs_t *vfsp, const char *n { struct vfsopt *opt; size_t namesize; + int locked; + + if (!(locked = mtx_owned(MNT_MTX(vfsp)))) + MNT_ILOCK(vfsp); if (vfsp->mnt_opt == NULL) { - vfsp->mnt_opt = malloc(sizeof(*vfsp->mnt_opt), M_MOUNT, M_WAITOK); - TAILQ_INIT(vfsp->mnt_opt); + void *opts; + + MNT_IUNLOCK(vfsp); + opts = malloc(sizeof(*vfsp->mnt_opt), M_MOUNT, M_WAITOK); + MNT_ILOCK(vfsp); + if (vfsp->mnt_opt == NULL) { + vfsp->mnt_opt = opts; + TAILQ_INIT(vfsp->mnt_opt); + } else { + free(opts, M_MOUNT); + } } - opt = malloc(sizeof(*opt), M_MOUNT, M_WAITOK); + MNT_IUNLOCK(vfsp); + opt = malloc(sizeof(*opt), M_MOUNT, M_WAITOK); namesize = strlen(name) + 1; opt->name = malloc(namesize, M_MOUNT, M_WAITOK); strlcpy(opt->name, name, namesize); opt->pos = -1; opt->seen = 1; - if (arg == NULL) { opt->value = NULL; opt->len = 0; @@ -67,16 
+80,23 @@ vfs_setmntopt(vfs_t *vfsp, const char *n opt->value = malloc(opt->len, M_MOUNT, M_WAITOK); bcopy(arg, opt->value, opt->len); } - /* TODO: Locking. */ + + MNT_ILOCK(vfsp); TAILQ_INSERT_TAIL(vfsp->mnt_opt, opt, link); + if (!locked) + MNT_IUNLOCK(vfsp); } void vfs_clearmntopt(vfs_t *vfsp, const char *name) { + int locked; - /* TODO: Locking. */ + if (!(locked = mtx_owned(MNT_MTX(vfsp)))) + MNT_ILOCK(vfsp); vfs_deleteopt(vfsp->mnt_opt, name); + if (!locked) + MNT_IUNLOCK(vfsp); } int @@ -92,12 +112,13 @@ vfs_optionisset(const vfs_t *vfsp, const } int -domount(kthread_t *td, vnode_t *vp, const char *fstype, char *fspath, +mount_snapshot(kthread_t *td, vnode_t **vpp, const char *fstype, char *fspath, char *fspec, int fsflags) { struct mount *mp; struct vfsconf *vfsp; struct ucred *cr; + vnode_t *vp; int error; /* @@ -112,23 +133,28 @@ domount(kthread_t *td, vnode_t *vp, cons if (vfsp == NULL) return (ENODEV); + vp = *vpp; if (vp->v_type != VDIR) return (ENOTDIR); + /* + * We need vnode lock to protect v_mountedhere and vnode interlock + * to protect v_iflag. + */ + vn_lock(vp, LK_SHARED | LK_RETRY); VI_LOCK(vp); - if ((vp->v_iflag & VI_MOUNT) != 0 || - vp->v_mountedhere != NULL) { + if ((vp->v_iflag & VI_MOUNT) != 0 || vp->v_mountedhere != NULL) { VI_UNLOCK(vp); + VOP_UNLOCK(vp, 0); return (EBUSY); } vp->v_iflag |= VI_MOUNT; VI_UNLOCK(vp); + VOP_UNLOCK(vp, 0); /* * Allocate and initialize the filesystem. */ - vn_lock(vp, LK_SHARED | LK_RETRY); mp = vfs_mount_alloc(vp, vfsp, fspath, td->td_ucred); - VOP_UNLOCK(vp, 0); mp->mnt_optnew = NULL; vfs_setmntopt(mp, "from", fspec, 0); @@ -138,11 +164,18 @@ domount(kthread_t *td, vnode_t *vp, cons /* * Set the mount level flags. */ - if (fsflags & MNT_RDONLY) - mp->mnt_flag |= MNT_RDONLY; - mp->mnt_flag &=~ MNT_UPDATEMASK; + mp->mnt_flag &= ~MNT_UPDATEMASK; mp->mnt_flag |= fsflags & (MNT_UPDATEMASK | MNT_FORCE | MNT_ROOTFS); /* + * Snapshots are always read-only. 
+ */ + mp->mnt_flag |= MNT_RDONLY; + /* + * We don't want snapshots to be visible in regular + * mount(8) and df(1) output. + */ + mp->mnt_flag |= MNT_IGNORE; + /* * Unprivileged user can trigger mounting a snapshot, but we don't want * him to unmount it, so we switch to privileged of original mount. */ @@ -150,11 +183,6 @@ domount(kthread_t *td, vnode_t *vp, cons mp->mnt_cred = crdup(vp->v_mount->mnt_cred); mp->mnt_stat.f_owner = mp->mnt_cred->cr_uid; /* - * Mount the filesystem. - * XXX The final recipients of VFS_MOUNT just overwrite the ndp they - * get. No freeing of cn_pnbuf. - */ - /* * XXX: This is evil, but we can't mount a snapshot as a regular user. * XXX: Is is safe when snapshot is mounted from within a jail? */ @@ -163,7 +191,7 @@ domount(kthread_t *td, vnode_t *vp, cons error = VFS_MOUNT(mp); td->td_ucred = cr; - if (!error) { + if (error == 0) { if (mp->mnt_opt != NULL) vfs_freeopts(mp->mnt_opt); mp->mnt_opt = mp->mnt_optnew; @@ -175,42 +203,33 @@ domount(kthread_t *td, vnode_t *vp, cons */ mp->mnt_optnew = NULL; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); - /* - * Put the new filesystem on the mount list after root. - */ #ifdef FREEBSD_NAMECACHE cache_purge(vp); #endif - if (!error) { + VI_LOCK(vp); + vp->v_iflag &= ~VI_MOUNT; + VI_UNLOCK(vp); + if (error == 0) { vnode_t *mvp; - VI_LOCK(vp); - vp->v_iflag &= ~VI_MOUNT; - VI_UNLOCK(vp); vp->v_mountedhere = mp; + /* + * Put the new filesystem on the mount list. 
+ */ mtx_lock(&mountlist_mtx); TAILQ_INSERT_TAIL(&mountlist, mp, mnt_list); mtx_unlock(&mountlist_mtx); vfs_event_signal(NULL, VQ_MOUNT, 0); if (VFS_ROOT(mp, LK_EXCLUSIVE, &mvp)) panic("mount: lost mount"); - mountcheckdirs(vp, mvp); - vput(mvp); - VOP_UNLOCK(vp, 0); - if ((mp->mnt_flag & MNT_RDONLY) == 0) - error = vfs_allocate_syncvnode(mp); + vput(vp); vfs_unbusy(mp); - if (error) - vrele(vp); - else - vfs_mountedfrom(mp, fspec); + *vpp = mvp; } else { - VI_LOCK(vp); - vp->v_iflag &= ~VI_MOUNT; - VI_UNLOCK(vp); - VOP_UNLOCK(vp, 0); + vput(vp); vfs_unbusy(mp); vfs_mount_destroy(mp); + *vpp = NULL; } return (error); } Modified: stable/8/sys/cddl/compat/opensolaris/sys/mutex.h ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/sys/mutex.h Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/compat/opensolaris/sys/mutex.h Tue Sep 15 11:13:40 2009 (r197215) @@ -32,9 +32,9 @@ #ifdef _KERNEL #include <sys/param.h> -#include <sys/proc.h> #include <sys/lock.h> #include_next <sys/mutex.h> +#include <sys/proc.h> #include <sys/sx.h> typedef enum { Modified: stable/8/sys/cddl/compat/opensolaris/sys/proc.h ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/sys/proc.h Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/compat/opensolaris/sys/proc.h Tue Sep 15 11:13:40 2009 (r197215) @@ -34,13 +34,17 @@ #include_next <sys/proc.h> #include <sys/stdint.h> #include <sys/smp.h> +#include <sys/sched.h> +#include <sys/lock.h> +#include <sys/mutex.h> +#include <sys/unistd.h> #include <sys/debug.h> #ifdef _KERNEL #define CPU curcpu -#define minclsyspri 0 -#define maxclsyspri 0 +#define minclsyspri PRIBIO +#define maxclsyspri PVM #define max_ncpus mp_ncpus #define boot_max_ncpus mp_ncpus @@ -54,11 +58,13 @@ typedef struct thread kthread_t; typedef struct thread *kthread_id_t; typedef struct proc proc_t; +extern struct proc *zfsproc; + static 
__inline kthread_t * thread_create(caddr_t stk, size_t stksize, void (*proc)(void *), void *arg, size_t len, proc_t *pp, int state, pri_t pri) { - proc_t *p; + kthread_t *td = NULL; int error; /* @@ -67,13 +73,20 @@ thread_create(caddr_t stk, size_t stksiz ASSERT(stk == NULL); ASSERT(len == 0); ASSERT(state == TS_RUN); + ASSERT(pp == &p0); - error = kproc_create(proc, arg, &p, 0, stksize / PAGE_SIZE, - "solthread %p", proc); - return (error == 0 ? FIRST_THREAD_IN_PROC(p) : NULL); + error = kproc_kthread_add(proc, arg, &zfsproc, &td, RFSTOPPED, + stksize / PAGE_SIZE, "zfskern", "solthread %p", proc); + if (error == 0) { + thread_lock(td); + sched_prio(td, pri); + sched_add(td, SRQ_BORING); + thread_unlock(td); + } + return (td); } -#define thread_exit() kproc_exit(0) +#define thread_exit() kthread_exit() #endif /* _KERNEL */ Modified: stable/8/sys/cddl/compat/opensolaris/sys/vfs.h ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/sys/vfs.h Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/compat/opensolaris/sys/vfs.h Tue Sep 15 11:13:40 2009 (r197215) @@ -110,8 +110,8 @@ void vfs_setmntopt(vfs_t *vfsp, const ch int flags __unused); void vfs_clearmntopt(vfs_t *vfsp, const char *name); int vfs_optionisset(const vfs_t *vfsp, const char *opt, char **argp); -int domount(kthread_t *td, vnode_t *vp, const char *fstype, char *fspath, - char *fspec, int fsflags); +int mount_snapshot(kthread_t *td, vnode_t **vpp, const char *fstype, + char *fspath, char *fspec, int fsflags); typedef uint64_t vfs_feature_t; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Tue Sep 15 11:13:40 2009 (r197215) @@ -19,7 +19,7 @@ * CDDL HEADER 
END */ /* - * Copyright 2008 Sun Microsystems, Inc. All rights reserved. + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. * Use is subject to license terms. */ @@ -864,10 +864,11 @@ restore_object(struct restorearg *ra, ob /* currently allocated, want to be allocated */ dmu_tx_hold_bonus(tx, drro->drr_object); /* - * We may change blocksize, so need to - * hold_write + * We may change blocksize and delete old content, + * so need to hold_write and hold_free. */ dmu_tx_hold_write(tx, drro->drr_object, 0, 1); + dmu_tx_hold_free(tx, drro->drr_object, 0, DMU_OBJECT_END); err = dmu_tx_assign(tx, TXG_WAIT); if (err) { dmu_tx_abort(tx); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Tue Sep 15 11:13:40 2009 (r197215) @@ -415,7 +415,7 @@ void dnode_reallocate(dnode_t *dn, dmu_object_type_t ot, int blocksize, dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx) { - int i, old_nblkptr; + int i, nblkptr; dmu_buf_impl_t *db = NULL; ASSERT3U(blocksize, >=, SPA_MINBLOCKSIZE); @@ -445,6 +445,8 @@ dnode_reallocate(dnode_t *dn, dmu_object dnode_free_range(dn, 0, -1ULL, tx); } + nblkptr = 1 + ((DN_MAX_BONUSLEN - bonuslen) >> SPA_BLKPTRSHIFT); + /* change blocksize */ rw_enter(&dn->dn_struct_rwlock, RW_WRITER); if (blocksize != dn->dn_datablksz && @@ -457,6 +459,8 @@ dnode_reallocate(dnode_t *dn, dmu_object dnode_setdirty(dn, tx); dn->dn_next_bonuslen[tx->tx_txg&TXG_MASK] = bonuslen; dn->dn_next_blksz[tx->tx_txg&TXG_MASK] = blocksize; + if (dn->dn_nblkptr != nblkptr) + dn->dn_next_nblkptr[tx->tx_txg&TXG_MASK] = nblkptr; rw_exit(&dn->dn_struct_rwlock); if (db) dbuf_rele(db, FTAG); @@ -466,19 +470,15 @@ dnode_reallocate(dnode_t *dn, dmu_object /* change bonus size and type */ 
mutex_enter(&dn->dn_mtx); - old_nblkptr = dn->dn_nblkptr; dn->dn_bonustype = bonustype; dn->dn_bonuslen = bonuslen; - dn->dn_nblkptr = 1 + ((DN_MAX_BONUSLEN - bonuslen) >> SPA_BLKPTRSHIFT); + dn->dn_nblkptr = nblkptr; dn->dn_checksum = ZIO_CHECKSUM_INHERIT; dn->dn_compress = ZIO_COMPRESS_INHERIT; ASSERT3U(dn->dn_nblkptr, <=, DN_MAX_NBLKPTR); - /* XXX - for now, we can't make nblkptr smaller */ - ASSERT3U(dn->dn_nblkptr, >=, old_nblkptr); - - /* fix up the bonus db_size if dn_nblkptr has changed */ - if (dn->dn_bonus && dn->dn_bonuslen != old_nblkptr) { + /* fix up the bonus db_size */ + if (dn->dn_bonus) { dn->dn_bonus->db.db_size = DN_MAX_BONUSLEN - (dn->dn_nblkptr-1) * sizeof (blkptr_t); ASSERT(dn->dn_bonuslen <= dn->dn_bonus->db.db_size); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Tue Sep 15 11:13:40 2009 (r197215) @@ -19,12 +19,10 @@ * CDDL HEADER END */ /* - * Copyright 2008 Sun Microsystems, Inc. All rights reserved. + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. * Use is subject to license terms. */ -#pragma ident "%Z%%M% %I% %E% SMI" - #include <sys/zfs_context.h> #include <sys/dbuf.h> #include <sys/dnode.h> @@ -534,18 +532,12 @@ dnode_sync(dnode_t *dn, dmu_tx_t *tx) /* XXX shouldn't the phys already be zeroed? 
*/ bzero(dnp, DNODE_CORE_SIZE); dnp->dn_nlevels = 1; + dnp->dn_nblkptr = dn->dn_nblkptr; } - if (dn->dn_nblkptr > dnp->dn_nblkptr) { - /* zero the new blkptrs we are gaining */ - bzero(dnp->dn_blkptr + dnp->dn_nblkptr, - sizeof (blkptr_t) * - (dn->dn_nblkptr - dnp->dn_nblkptr)); - } dnp->dn_type = dn->dn_type; dnp->dn_bonustype = dn->dn_bonustype; dnp->dn_bonuslen = dn->dn_bonuslen; - dnp->dn_nblkptr = dn->dn_nblkptr; } ASSERT(dnp->dn_nlevels > 1 || @@ -605,6 +597,30 @@ dnode_sync(dnode_t *dn, dmu_tx_t *tx) return; } + if (dn->dn_next_nblkptr[txgoff]) { + /* this should only happen on a realloc */ + ASSERT(dn->dn_allocated_txg == tx->tx_txg); + if (dn->dn_next_nblkptr[txgoff] > dnp->dn_nblkptr) { + /* zero the new blkptrs we are gaining */ + bzero(dnp->dn_blkptr + dnp->dn_nblkptr, + sizeof (blkptr_t) * + (dn->dn_next_nblkptr[txgoff] - dnp->dn_nblkptr)); +#ifdef ZFS_DEBUG + } else { + int i; + ASSERT(dn->dn_next_nblkptr[txgoff] < dnp->dn_nblkptr); + /* the blkptrs we are losing better be unallocated */ + for (i = dn->dn_next_nblkptr[txgoff]; + i < dnp->dn_nblkptr; i++) + ASSERT(BP_IS_HOLE(&dnp->dn_blkptr[i])); +#endif + } + mutex_enter(&dn->dn_mtx); + dnp->dn_nblkptr = dn->dn_next_nblkptr[txgoff]; + dn->dn_next_nblkptr[txgoff] = 0; + mutex_exit(&dn->dn_mtx); + } + if (dn->dn_next_nlevels[txgoff]) { dnode_increase_indirection(dn, tx); dn->dn_next_nlevels[txgoff] = 0; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Tue Sep 15 11:13:40 2009 (r197215) @@ -1419,6 +1419,7 @@ dsl_dataset_drain_refs(dsl_dataset_t *ds { struct refsarg arg; + bzero(&arg, sizeof(arg)); mutex_init(&arg.lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&arg.cv, NULL, CV_DEFAULT, NULL); arg.gone = FALSE; Modified: 
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Tue Sep 15 11:13:40 2009 (r197215) @@ -19,7 +19,7 @@ * CDDL HEADER END */ /* - * Copyright 2008 Sun Microsystems, Inc. All rights reserved. + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. * Use is subject to license terms. */ @@ -160,6 +160,7 @@ typedef struct dnode { uint16_t dn_datablkszsec; /* in 512b sectors */ uint32_t dn_datablksz; /* in bytes */ uint64_t dn_maxblkid; + uint8_t dn_next_nblkptr[TXG_SIZE]; uint8_t dn_next_nlevels[TXG_SIZE]; uint8_t dn_next_indblkshift[TXG_SIZE]; uint16_t dn_next_bonuslen[TXG_SIZE]; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h Tue Sep 15 11:13:40 2009 (r197215) @@ -231,8 +231,27 @@ typedef struct znode { /* * Convert between znode pointers and vnode pointers */ +#ifdef DEBUG +static __inline vnode_t * +ZTOV(znode_t *zp) +{ + vnode_t *vp = zp->z_vnode; + + ASSERT(vp == NULL || vp->v_data == NULL || vp->v_data == zp); + return (vp); +} +static __inline znode_t * +VTOZ(vnode_t *vp) +{ + znode_t *zp = (znode_t *)vp->v_data; + + ASSERT(zp == NULL || zp->z_vnode == NULL || zp->z_vnode == vp); + return (zp); +} +#else #define ZTOV(ZP) ((ZP)->z_vnode) #define VTOZ(VP) ((znode_t *)(VP)->v_data) +#endif /* * ZFS_ENTER() is called on entry to each ZFS vnode and vfs operation. 
Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Tue Sep 15 02:25:03 2009 (r197214) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Tue Sep 15 11:13:40 2009 (r197215) @@ -194,6 +194,10 @@ vdev_geom_worker(void *arg) zio_t *zio; struct bio *bp; + thread_lock(curthread); + sched_prio(curthread, PRIBIO); + thread_unlock(curthread); + ctx = arg; for (;;) { mtx_lock(&ctx->gc_queue_mtx); @@ -203,7 +207,7 @@ vdev_geom_worker(void *arg) ctx->gc_state = 2; wakeup_one(&ctx->gc_state); mtx_unlock(&ctx->gc_queue_mtx); - kproc_exit(0); + kthread_exit(); } msleep(&ctx->gc_queue, &ctx->gc_queue_mtx, PRIBIO | PDROP, "vgeom:io", 0); @@ -530,8 +534,8 @@ vdev_geom_open(vdev_t *vd, uint64_t *psi vd->vdev_tsd = ctx; pp = cp->provider; - kproc_create(vdev_geom_worker, ctx, NULL, 0, 0, "vdev:worker %s", - pp->name); + kproc_kthread_add(vdev_geom_worker, ctx, &zfsproc, NULL, 0, 0, + "zfskern", "vdev %s", pp->name); /* * Determine the actual size of the device. 
Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c	Tue Sep 15 02:25:03 2009	(r197214)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c	Tue Sep 15 11:13:40 2009	(r197215)
@@ -181,10 +181,11 @@ mze_compare(const void *arg1, const void
 	return (0);
 }
 
-static void
+static int
 mze_insert(zap_t *zap, int chunkid, uint64_t hash, mzap_ent_phys_t *mzep)
 {
 	mzap_ent_t *mze;
+	avl_index_t idx;
 
 	ASSERT(zap->zap_ismicro);
 	ASSERT(RW_WRITE_HELD(&zap->zap_rwlock));
@@ -194,7 +195,12 @@ mze_insert(zap_t *zap, int chunkid, uint
 	mze->mze_chunkid = chunkid;
 	mze->mze_hash = hash;
 	mze->mze_phys = *mzep;
-	avl_add(&zap->zap_m.zap_avl, mze);
+	if (avl_find(&zap->zap_m.zap_avl, mze, &idx) != NULL) {
+		kmem_free(mze, sizeof (mzap_ent_t));
+		return (EEXIST);
+	}
+	avl_insert(&zap->zap_m.zap_avl, mze, idx);
+	return (0);
 }
 
 static mzap_ent_t *
@@ -329,10 +335,15 @@ mzap_open(objset_t *os, uint64_t obj, dm
 			if (mze->mze_name[0]) {
 				zap_name_t *zn;
 
-				zap->zap_m.zap_num_entries++;
 				zn = zap_name_alloc(zap, mze->mze_name,
 				    MT_EXACT);
-				mze_insert(zap, i, zn->zn_hash, mze);
+				if (mze_insert(zap, i, zn->zn_hash, mze) == 0)
+					zap->zap_m.zap_num_entries++;
+				else {
+					printf("ZFS WARNING: Duplicated ZAP "
+					    "entry detected (%s).\n",
+					    mze->mze_name);
+				}
 				zap_name_free(zn);
 			}
 		}
@@ -771,7 +782,7 @@ again:
 			if (zap->zap_m.zap_alloc_next ==
 			    zap->zap_m.zap_num_chunks)
 				zap->zap_m.zap_alloc_next = 0;
-			mze_insert(zap, i, zn->zn_hash, mze);
+			VERIFY(0 == mze_insert(zap, i, zn->zn_hash, mze));
 			return;
 		}
 	}

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c	Tue Sep 15 02:25:03 2009	(r197214)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c	Tue Sep 15 11:13:40 2009	(r197215)
@@ -669,9 +669,12 @@ zfsctl_snapdir_remove(vnode_t *dvp, char
 	if (sep) {
 		avl_remove(&sdp->sd_snaps, sep);
 		err = zfsctl_unmount_snap(sep, MS_FORCE, cr);
-		if (err)
-			avl_add(&sdp->sd_snaps, sep);
-		else
+		if (err) {
+			avl_index_t where;
+
+			if (avl_find(&sdp->sd_snaps, sep, &where) == NULL)
+				avl_insert(&sdp->sd_snaps, sep, where);
+		} else
 			err = dmu_objset_destroy(snapname);
 	} else {
 		err = ENOENT;
@@ -877,20 +880,20 @@ domount:
 	mountpoint = kmem_alloc(mountpoint_len, KM_SLEEP);
 	(void) snprintf(mountpoint, mountpoint_len, "%s/.zfs/snapshot/%s",
 	    dvp->v_vfsp->mnt_stat.f_mntonname, nm);
-	err = domount(curthread, *vpp, "zfs", mountpoint, snapname, 0);
+	err = mount_snapshot(curthread, vpp, "zfs", mountpoint, snapname, 0);
 	kmem_free(mountpoint, mountpoint_len);
-	/* FreeBSD: This line was moved from below to avoid a lock recursion. */
-	if (err == 0)
-		vn_lock(*vpp, LK_EXCLUSIVE | LK_RETRY);
-	mutex_exit(&sdp->sd_lock);
-	/*
-	 * If we had an error, drop our hold on the vnode and
-	 * zfsctl_snapshot_inactive() will clean up.
-	 */
-	if (err) {
-		VN_RELE(*vpp);
-		*vpp = NULL;
+	if (err == 0) {
+		/*
+		 * Fix up the root vnode mounted on .zfs/snapshot/<snapname>.
+		 *
+		 * This is where we lie about our v_vfsp in order to
+		 * make .zfs/snapshot/<snapname> accessible over NFS
+		 * without requiring manual mounts of <snapname>.
+		 */
+		ASSERT(VTOZ(*vpp)->z_zfsvfs != zfsvfs);
+		VTOZ(*vpp)->z_zfsvfs->z_parent = zfsvfs;
 	}
+	mutex_exit(&sdp->sd_lock);
 	ZFS_EXIT(zfsvfs);
 	return (err);
 }
@@ -1344,7 +1347,17 @@ zfsctl_umount_snapshots(vfs_t *vfsp, int
 		if (vn_ismntpt(sep->se_root)) {
 			error = zfsctl_unmount_snap(sep, fflags, cr);
 			if (error) {
-				avl_add(&sdp->sd_snaps, sep);
+				avl_index_t where;
+
+				/*
+				 * Before reinserting snapshot to the tree,
+				 * check if it was actually removed. For example
+				 * when snapshot mount point is busy, we will
+				 * have an error here, but there will be no need
+				 * to reinsert snapshot.
+				 */
+				if (avl_find(&sdp->sd_snaps, sep, &where) == NULL)
+					avl_insert(&sdp->sd_snaps, sep, where);
 				break;
 			}
 		}

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	Tue Sep 15 02:25:03 2009	(r197214)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	Tue Sep 15 11:13:40 2009	(r197215)
@@ -3021,8 +3021,10 @@ zfsdev_ioctl(struct cdev *dev, u_long cm
 	if (error == 0)
 		error = zfs_ioc_vec[vec].zvec_func(zc);
 
-	if (zfs_ioc_vec[vec].zvec_his_log == B_TRUE)
-		zfs_log_history(zc);
+	if (error == 0) {
+		if (zfs_ioc_vec[vec].zvec_his_log == B_TRUE)
+			zfs_log_history(zc);
+	}
 
 	return (error);
 }
@@ -3057,6 +3059,7 @@ zfsdev_fini(void)
 }
 
 static struct root_hold_token *zfs_root_token;
+struct proc *zfsproc;
 
 uint_t zfs_fsyncer_key;
 extern uint_t rrw_tsd_key;

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	Tue Sep 15 02:25:03 2009	(r197214)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	Tue Sep 15 11:13:40 2009	(r197215)
@@ -97,6 +97,8 @@ static int zfs_root(vfs_t *vfsp, int fla
 static int zfs_statfs(vfs_t *vfsp, struct statfs *statp);
 static int zfs_vget(vfs_t *vfsp, ino_t ino, int flags, vnode_t **vpp);
 static int zfs_sync(vfs_t *vfsp, int waitfor);
+static int zfs_checkexp(vfs_t *vfsp, struct sockaddr *nam, int *extflagsp,
+    struct ucred **credanonp, int *numsecflavors, int **secflavors);
 static int zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vnode_t **vpp);
 static void zfs_objset_close(zfsvfs_t *zfsvfs);
 static void zfs_freevfs(vfs_t *vfsp);
@@ -108,6 +110,7 @@ static struct vfsops zfs_vfsops = {
 	.vfs_statfs =		zfs_statfs,
 	.vfs_vget =		zfs_vget,
 	.vfs_sync =		zfs_sync,
+	.vfs_checkexp =		zfs_checkexp,
 	.vfs_fhtovp =		zfs_fhtovp,
 };
 
@@ -337,6 +340,13 @@ zfs_register_callbacks(vfs_t *vfsp)
 	os = zfsvfs->z_os;
 
 	/*
+	 * This function can be called for a snapshot when we update snapshot's
+	 * mount point, which isn't really supported.
+	 */
+	if (dmu_objset_is_snapshot(os))
+		return (EOPNOTSUPP);
+
+	/*
 	 * The act of registering our callbacks will destroy any mount
 	 * options we may have.  In order to enable temporary overrides
 	 * of mount options, we stash away the current values and
@@ -719,7 +729,10 @@ zfs_mount(vfs_t *vfsp)
 	error = secpolicy_fs_mount(cr, mvp, vfsp);
 	if (error) {
 		error = dsl_deleg_access(osname, ZFS_DELEG_PERM_MOUNT, cr);
-		if (error == 0) {
+		if (error != 0)
+			goto out;
+
+		if (!(vfsp->vfs_flag & MS_REMOUNT)) {
 			vattr_t		vattr;
 
 			/*
@@ -729,7 +742,9 @@ zfs_mount(vfs_t *vfsp)
 
 			vattr.va_mask = AT_UID;
 
+			vn_lock(mvp, LK_SHARED | LK_RETRY);
 			if (error = VOP_GETATTR(mvp, &vattr, cr)) {
+				VOP_UNLOCK(mvp, 0);
 				goto out;
 			}
 
@@ -741,18 +756,19 @@ zfs_mount(vfs_t *vfsp)
 			}
 #else
 			if (error = secpolicy_vnode_owner(mvp, cr, vattr.va_uid)) {
+				VOP_UNLOCK(mvp, 0);
 				goto out;
 			}
 
 			if (error = VOP_ACCESS(mvp, VWRITE, cr, td)) {
+				VOP_UNLOCK(mvp, 0);
 				goto out;
 			}
+			VOP_UNLOCK(mvp, 0);
 #endif
-
-			secpolicy_fs_mount_clearopts(cr, vfsp);
-		} else {
-			goto out;
 		}
+
+		secpolicy_fs_mount_clearopts(cr, vfsp);
 	}
 
 	/*
@@ -931,6 +947,18 @@ zfsvfs_teardown(zfsvfs_t *zfsvfs, boolea
 	zfsvfs->z_unmounted = B_TRUE;
 	rrw_exit(&zfsvfs->z_teardown_lock, FTAG);
 	rw_exit(&zfsvfs->z_teardown_inactive_lock);
+
+#ifdef __FreeBSD__
+	/*
+	 * Some znodes might not be fully reclaimed, wait for them.
+	 */
+	mutex_enter(&zfsvfs->z_znodes_lock);
+	while (list_head(&zfsvfs->z_all_znodes) != NULL) {
+		msleep(zfsvfs, &zfsvfs->z_znodes_lock, 0,
+		    "zteardown", 0);
+	}
+	mutex_exit(&zfsvfs->z_znodes_lock);
+#endif
 }
 
 /*
@@ -1086,6 +1114,20 @@ zfs_vget(vfs_t *vfsp, ino_t ino, int fla
 	znode_t		*zp;
 	int		err;
 
+	/*
+	 * XXXPJD: zfs_zget() can't operate on virtual entires like .zfs/ or
+	 * .zfs/snapshot/ directories, so for now just return EOPNOTSUPP.
+	 * This will make NFS to fall back to using READDIR instead of
+	 * READDIRPLUS.
+	 * Also snapshots are stored in AVL tree, but based on their names,
+	 * not inode numbers, so it will be very inefficient to iterate
+	 * over all snapshots to find the right one.
+	 * Note that OpenSolaris READDIRPLUS implementation does LOOKUP on
+	 * d_name, and not VGET on d_fileno as we do.
+	 */
+	if (ino == ZFSCTL_INO_ROOT || ino == ZFSCTL_INO_SNAPDIR)
+		return (EOPNOTSUPP);
+
 	ZFS_ENTER(zfsvfs);
 	err = zfs_zget(zfsvfs, ino, &zp);
 	if (err == 0 && zp->z_unlinked) {
@@ -1103,6 +1145,28 @@ zfs_vget(vfs_t *vfsp, ino_t ino, int fla
 }
 
 static int
+zfs_checkexp(vfs_t *vfsp, struct sockaddr *nam, int *extflagsp,
+    struct ucred **credanonp, int *numsecflavors, int **secflavors)
+{
+	zfsvfs_t *zfsvfs = vfsp->vfs_data;
+
+	/*
+	 * If this is regular file system vfsp is the same as
+	 * zfsvfs->z_parent->z_vfs, but if it is snapshot,
+	 * zfsvfs->z_parent->z_vfs represents parent file system
+	 * which we have to use here, because only this file system
+	 * has mnt_export configured.
+	 */
+	vfsp = zfsvfs->z_parent->z_vfs;
+
+	return (vfs_stdcheckexp(zfsvfs->z_parent->z_vfs, nam, extflagsp,
+	    credanonp, numsecflavors, secflavors));
+}
+
+CTASSERT(SHORT_FID_LEN <= sizeof(struct fid));
+CTASSERT(LONG_FID_LEN <= sizeof(struct fid));
+
+static int
 zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vnode_t **vpp)
 {
 	zfsvfs_t	*zfsvfs = vfsp->vfs_data;
@@ -1117,7 +1181,11 @@ zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vno
 
 	ZFS_ENTER(zfsvfs);
 
-	if (fidp->fid_len == LONG_FID_LEN) {
+	/*
+	 * On FreeBSD we can get snapshot's mount point or its parent file
+	 * system mount point depending if snapshot is already mounted or not.
+	 */
+	if (zfsvfs->z_parent == zfsvfs && fidp->fid_len == LONG_FID_LEN) {
 		zfid_long_t	*zlfid = (zfid_long_t *)fidp;
 		uint64_t	objsetid = 0;
 		uint64_t	setgen = 0;
@@ -1160,9 +1228,8 @@ zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vno
 		} else {
 			VN_HOLD(*vpp);
 		}
-		ZFS_EXIT(zfsvfs);
-		/* XXX: LK_RETRY? */
 		vn_lock(*vpp, LK_EXCLUSIVE | LK_RETRY);
+		ZFS_EXIT(zfsvfs);
 		return (0);
 	}
 
@@ -1184,7 +1251,6 @@ zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vno
 	}
 
 	*vpp = ZTOV(zp);
-	/* XXX: LK_RETRY? */
 	vn_lock(*vpp, LK_EXCLUSIVE | LK_RETRY);
 	vnode_create_vobject(*vpp, zp->z_phys->zp_size, curthread);
 	ZFS_EXIT(zfsvfs);

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
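
Several hunks above (mze_insert() in zap_micro.c and the two snapshot reinsert paths in zfs_ctldir.c) replace avl_add() with an avl_find() followed by avl_insert(), so that an entry that is already present is handled as an ordinary condition rather than tripping the check inside avl_add(). The userland sketch below illustrates only that find-then-insert pattern. It is not the kernel AVL API: the sorted array, struct table and the table_*() functions are hypothetical stand-ins invented for the illustration, and the ENOSPC case exists only because the toy table has a fixed size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TABLE_CAP 8

/* Hypothetical stand-in for the AVL tree: a small array kept sorted. */
struct table {
	unsigned long keys[TABLE_CAP];
	size_t count;
};

/*
 * Look up key.  Returns 1 if it is present.  Otherwise returns 0 and
 * stores in *where the index at which key would have to be inserted;
 * this plays the role of the avl_index_t filled in by avl_find().
 */
static int
table_find(const struct table *t, unsigned long key, size_t *where)
{
	size_t i;

	for (i = 0; i < t->count && t->keys[i] < key; i++)
		continue;
	*where = i;
	return (i < t->count && t->keys[i] == key);
}

/* Insert key at a position previously computed by table_find(). */
static void
table_insert_at(struct table *t, unsigned long key, size_t where)
{
	assert(t->count < TABLE_CAP && where <= t->count);
	memmove(&t->keys[where + 1], &t->keys[where],
	    (t->count - where) * sizeof(t->keys[0]));
	t->keys[where] = key;
	t->count++;
}

/* Find-then-insert: a duplicate is reported as EEXIST, not asserted on. */
static int
table_add(struct table *t, unsigned long key)
{
	size_t where;

	if (table_find(t, key, &where))
		return (EEXIST);
	if (t->count == TABLE_CAP)
		return (ENOSPC);
	table_insert_at(t, key, where);
	return (0);
}
```

A caller can then decide what a duplicate means in its context, much as mzap_open() above only logs a warning while the insert path at line 782 keeps a hard VERIFY().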