From owner-svn-src-head@freebsd.org Thu Jan 19 19:15:22 2017 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A1B2ECB8474; Thu, 19 Jan 2017 19:15:22 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7C2521985; Thu, 19 Jan 2017 19:15:22 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v0JJFL1F090581; Thu, 19 Jan 2017 19:15:21 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v0JJFLEL090577; Thu, 19 Jan 2017 19:15:21 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201701191915.v0JJFLEL090577@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Thu, 19 Jan 2017 19:15:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r312428 - head/sys/fs/tmpfs X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Jan 2017 19:15:22 -0000 Author: kib Date: Thu Jan 19 19:15:21 2017 New Revision: 312428 URL: https://svnweb.freebsd.org/changeset/base/312428 Log: Refcount tmpfs nodes and mount structures. On dotdot lookup and fhtovp operations, it is possible for the file represented by tmpfs node to be removed after the thread calculated the pointer. In this case, tmpfs_alloc_vp() accesses freed memory. Introduce the reference count on the nodes. The allnodes list from tmpfs mount owns 1 reference, and threads performing unlocked operations on the node, add one transient reference. Similarly, since struct tmpfs_mount maintains the list where nodes are enlisted, refcount it by one reference from struct mount and one reference from each node on the list. Both nodes and tmpfs_mounts are removed when refcount goes to zero. Note that this means that nodes and tmpfs_mounts might survive some time after the node is deleted or tmpfs_unmount() finished. The tmpfs_alloc_vp() in these cases returns error either due to node removal (tn_nlinks == 0) or because of insmntque1(9) error. Tested by: pho (as part of larger patch) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Modified: head/sys/fs/tmpfs/tmpfs.h head/sys/fs/tmpfs/tmpfs_subr.c head/sys/fs/tmpfs/tmpfs_vfsops.c head/sys/fs/tmpfs/tmpfs_vnops.c Modified: head/sys/fs/tmpfs/tmpfs.h ============================================================================== --- head/sys/fs/tmpfs/tmpfs.h Thu Jan 19 18:52:38 2017 (r312427) +++ head/sys/fs/tmpfs/tmpfs.h Thu Jan 19 19:15:21 2017 (r312428) @@ -158,9 +158,11 @@ struct tmpfs_node { * Doubly-linked list entry which links all existing nodes for * a single file system. This is provided to ease the removal * of all nodes during the unmount operation, and to support - * the implementation of VOP_VNTOCNP(). + * the implementation of VOP_VNTOCNP(). tn_attached is false + * when the node is removed from list and unlocked. */ LIST_ENTRY(tmpfs_node) tn_entries; /* (m) */ + bool tn_attached; /* (m) */ /* * The node's type. Any of 'VBLK', 'VCHR', 'VDIR', 'VFIFO', @@ -231,6 +233,9 @@ struct tmpfs_node { */ int tn_vpstate; /* (i) */ + /* Transient refcounter on this node. */ + u_int tn_refcount; /* (m) + (i) */ + /* misc data field for different tn_type node */ union { /* Valid when tn_type == VBLK || tn_type == VCHR. */ @@ -360,6 +365,9 @@ struct tmpfs_mount { /* Number of nodes currently that are in use. */ ino_t tm_nodes_inuse; + /* Refcounter on this struct tmpfs_mount. */ + uint64_t tm_refcount; + /* maximum representable file size */ u_int64_t tm_maxfilesize; @@ -404,10 +412,14 @@ struct tmpfs_dir_cursor { * Prototypes for tmpfs_subr.c. */ +void tmpfs_ref_node(struct tmpfs_node *node); +void tmpfs_ref_node_locked(struct tmpfs_node *node); int tmpfs_alloc_node(struct mount *mp, struct tmpfs_mount *, enum vtype, uid_t uid, gid_t gid, mode_t mode, struct tmpfs_node *, char *, dev_t, struct tmpfs_node **); void tmpfs_free_node(struct tmpfs_mount *, struct tmpfs_node *); +bool tmpfs_free_node_locked(struct tmpfs_mount *, struct tmpfs_node *, bool); +void tmpfs_free_tmp(struct tmpfs_mount *); int tmpfs_alloc_dirent(struct tmpfs_mount *, struct tmpfs_node *, const char *, u_int, struct tmpfs_dirent **); void tmpfs_free_dirent(struct tmpfs_mount *, struct tmpfs_dirent *); Modified: head/sys/fs/tmpfs/tmpfs_subr.c ============================================================================== --- head/sys/fs/tmpfs/tmpfs_subr.c Thu Jan 19 18:52:38 2017 (r312427) +++ head/sys/fs/tmpfs/tmpfs_subr.c Thu Jan 19 19:15:21 2017 (r312428) @@ -131,6 +131,26 @@ tmpfs_pages_check_avail(struct tmpfs_mou return (1); } +void +tmpfs_ref_node(struct tmpfs_node *node) +{ + + TMPFS_NODE_LOCK(node); + tmpfs_ref_node_locked(node); + TMPFS_NODE_UNLOCK(node); +} + +void +tmpfs_ref_node_locked(struct tmpfs_node *node) +{ + + TMPFS_NODE_ASSERT_LOCKED(node); + KASSERT(node->tn_refcount > 0, ("node %p zero refcount", node)); + KASSERT(node->tn_refcount < UINT_MAX, ("node %p refcount %u", node, + node->tn_refcount)); + node->tn_refcount++; +} + /* * Allocates a new node of type 'type' inside the 'tmp' mount point, with * its owner set to 'uid', its group to 'gid' and its mode set to 'mode', @@ -205,6 +225,7 @@ tmpfs_alloc_node(struct mount *mp, struc nnode->tn_gid = gid; nnode->tn_mode = mode; nnode->tn_id = alloc_unr(tmp->tm_ino_unr); + nnode->tn_refcount = 1; /* Type-specific initialization. */ switch (nnode->tn_type) { @@ -258,7 +279,9 @@ tmpfs_alloc_node(struct mount *mp, struc TMPFS_LOCK(tmp); LIST_INSERT_HEAD(&tmp->tm_nodes_used, nnode, tn_entries); + nnode->tn_attached = true; tmp->tm_nodes_inuse++; + tmp->tm_refcount++; TMPFS_UNLOCK(tmp); *node = nnode; @@ -272,18 +295,40 @@ tmpfs_alloc_node(struct mount *mp, struc void tmpfs_free_node(struct tmpfs_mount *tmp, struct tmpfs_node *node) { + + TMPFS_LOCK(tmp); + TMPFS_NODE_LOCK(node); + if (!tmpfs_free_node_locked(tmp, node, false)) { + TMPFS_NODE_UNLOCK(node); + TMPFS_UNLOCK(tmp); + } +} + +bool +tmpfs_free_node_locked(struct tmpfs_mount *tmp, struct tmpfs_node *node, + bool detach) +{ vm_object_t uobj; + TMPFS_MP_ASSERT_LOCKED(tmp); + TMPFS_NODE_ASSERT_LOCKED(node); + KASSERT(node->tn_refcount > 0, ("node %p refcount zero", node)); + + node->tn_refcount--; + if (node->tn_attached && (detach || node->tn_refcount == 0)) { + MPASS(tmp->tm_nodes_inuse > 0); + tmp->tm_nodes_inuse--; + LIST_REMOVE(node, tn_entries); + node->tn_attached = false; + } + if (node->tn_refcount > 0) + return (false); + #ifdef INVARIANTS - TMPFS_NODE_LOCK(node); MPASS(node->tn_vnode == NULL); MPASS((node->tn_vpstate & TMPFS_VNODE_ALLOCATING) == 0); - TMPFS_NODE_UNLOCK(node); #endif - - TMPFS_LOCK(tmp); - LIST_REMOVE(node, tn_entries); - tmp->tm_nodes_inuse--; + TMPFS_NODE_UNLOCK(node); TMPFS_UNLOCK(tmp); switch (node->tn_type) { @@ -323,6 +368,9 @@ tmpfs_free_node(struct tmpfs_mount *tmp, free_unr(tmp->tm_ino_unr, node->tn_id); uma_zfree(tmp->tm_node_pool, node); + TMPFS_LOCK(tmp); + tmpfs_free_tmp(tmp); + return (true); } static __inline uint32_t @@ -468,13 +516,16 @@ tmpfs_alloc_vp(struct mount *mp, struct struct vnode **vpp) { struct vnode *vp; + struct tmpfs_mount *tm; vm_object_t object; int error; error = 0; -loop: + tm = VFS_TO_TMPFS(mp); TMPFS_NODE_LOCK(node); -loop1: + tmpfs_ref_node_locked(node); +loop: + TMPFS_NODE_ASSERT_LOCKED(node); if ((vp = node->tn_vnode) != NULL) { MPASS((node->tn_vpstate & TMPFS_VNODE_DOOMED) == 0); VI_LOCK(vp); @@ -494,12 +545,14 @@ loop1: msleep(&node->tn_vnode, TMPFS_NODE_MTX(node), 0, "tmpfsE", 0); } - goto loop1; + goto loop; } TMPFS_NODE_UNLOCK(node); error = vget(vp, lkflag | LK_INTERLOCK, curthread); - if (error == ENOENT) + if (error == ENOENT) { + TMPFS_NODE_LOCK(node); goto loop; + } if (error != 0) { vp = NULL; goto out; @@ -511,6 +564,7 @@ loop1: */ if (node->tn_vnode == NULL || node->tn_vnode != vp) { vput(vp); + TMPFS_NODE_LOCK(node); goto loop; } @@ -532,11 +586,9 @@ loop1: if (node->tn_vpstate & TMPFS_VNODE_ALLOCATING) { node->tn_vpstate |= TMPFS_VNODE_WANT; error = msleep((caddr_t) &node->tn_vpstate, - TMPFS_NODE_MTX(node), PDROP | PCATCH, - "tmpfs_alloc_vp", 0); - if (error) - return error; - + TMPFS_NODE_MTX(node), 0, "tmpfs_alloc_vp", 0); + if (error != 0) + goto out; goto loop; } else node->tn_vpstate |= TMPFS_VNODE_ALLOCATING; @@ -610,16 +662,17 @@ unlock: TMPFS_NODE_UNLOCK(node); out: - *vpp = vp; + if (error == 0) { + *vpp = vp; #ifdef INVARIANTS - if (error == 0) { MPASS(*vpp != NULL && VOP_ISLOCKED(*vpp)); TMPFS_NODE_LOCK(node); MPASS(*vpp == node->tn_vnode); TMPFS_NODE_UNLOCK(node); - } #endif + } + tmpfs_free_node(tm, node); return (error); } Modified: head/sys/fs/tmpfs/tmpfs_vfsops.c ============================================================================== --- head/sys/fs/tmpfs/tmpfs_vfsops.c Thu Jan 19 18:52:38 2017 (r312427) +++ head/sys/fs/tmpfs/tmpfs_vfsops.c Thu Jan 19 19:15:21 2017 (r312428) @@ -222,6 +222,7 @@ tmpfs_mount(struct mount *mp) mtx_init(&tmp->tm_allnode_lock, "tmpfs allnode lock", NULL, MTX_DEF); tmp->tm_nodes_max = nodes_max; tmp->tm_nodes_inuse = 0; + tmp->tm_refcount = 1; tmp->tm_maxfilesize = maxfilesize > 0 ? maxfilesize : OFF_MAX; LIST_INIT(&tmp->tm_nodes_used); @@ -304,11 +305,35 @@ tmpfs_unmount(struct mount *mp, int mntf TMPFS_LOCK(tmp); while ((node = LIST_FIRST(&tmp->tm_nodes_used)) != NULL) { - TMPFS_UNLOCK(tmp); + TMPFS_NODE_LOCK(node); if (node->tn_type == VDIR) tmpfs_dir_destroy(tmp, node); - tmpfs_free_node(tmp, node); - TMPFS_LOCK(tmp); + if (tmpfs_free_node_locked(tmp, node, true)) + TMPFS_LOCK(tmp); + else + TMPFS_NODE_UNLOCK(node); + } + + mp->mnt_data = NULL; + tmpfs_free_tmp(tmp); + vfs_write_resume(mp, VR_START_WRITE); + + MNT_ILOCK(mp); + mp->mnt_flag &= ~MNT_LOCAL; + MNT_IUNLOCK(mp); + + return (0); +} + +void +tmpfs_free_tmp(struct tmpfs_mount *tmp) +{ + + MPASS(tmp->tm_refcount > 0); + tmp->tm_refcount--; + if (tmp->tm_refcount > 0) { + TMPFS_UNLOCK(tmp); + return; } TMPFS_UNLOCK(tmp); @@ -320,16 +345,7 @@ tmpfs_unmount(struct mount *mp, int mntf MPASS(tmp->tm_pages_used == 0); MPASS(tmp->tm_nodes_inuse == 0); - /* Throw away the tmpfs_mount structure. */ - free(mp->mnt_data, M_TMPFSMNT); - mp->mnt_data = NULL; - vfs_write_resume(mp, VR_START_WRITE); - - MNT_ILOCK(mp); - mp->mnt_flag &= ~MNT_LOCAL; - MNT_IUNLOCK(mp); - - return (0); + free(tmp, M_TMPFSMNT); } static int @@ -347,36 +363,36 @@ static int tmpfs_fhtovp(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp) { - boolean_t found; struct tmpfs_fid *tfhp; struct tmpfs_mount *tmp; struct tmpfs_node *node; + int error; tmp = VFS_TO_TMPFS(mp); tfhp = (struct tmpfs_fid *)fhp; if (tfhp->tf_len != sizeof(struct tmpfs_fid)) - return EINVAL; + return (EINVAL); if (tfhp->tf_id >= tmp->tm_nodes_max) - return EINVAL; - - found = FALSE; + return (EINVAL); TMPFS_LOCK(tmp); LIST_FOREACH(node, &tmp->tm_nodes_used, tn_entries) { if (node->tn_id == tfhp->tf_id && node->tn_gen == tfhp->tf_gen) { - found = TRUE; + tmpfs_ref_node(node); break; } } TMPFS_UNLOCK(tmp); - if (found) - return (tmpfs_alloc_vp(mp, node, LK_EXCLUSIVE, vpp)); - - return (EINVAL); + if (node != NULL) { + error = tmpfs_alloc_vp(mp, node, LK_EXCLUSIVE, vpp); + tmpfs_free_node(tmp, node); + } else + error = EINVAL; + return (error); } /* ARGSUSED2 */ Modified: head/sys/fs/tmpfs/tmpfs_vnops.c ============================================================================== --- head/sys/fs/tmpfs/tmpfs_vnops.c Thu Jan 19 18:52:38 2017 (r312427) +++ head/sys/fs/tmpfs/tmpfs_vnops.c Thu Jan 19 19:15:21 2017 (r312428) @@ -82,7 +82,8 @@ tmpfs_lookup(struct vop_cachedlookup_arg struct vnode **vpp = v->a_vpp; struct componentname *cnp = v->a_cnp; struct tmpfs_dirent *de; - struct tmpfs_node *dnode; + struct tmpfs_node *dnode, *pnode; + struct tmpfs_mount *tm; int error; dnode = VP_TO_TMPFS_DIR(dvp); @@ -104,8 +105,12 @@ tmpfs_lookup(struct vop_cachedlookup_arg goto out; } if (cnp->cn_flags & ISDOTDOT) { + tm = VFS_TO_TMPFS(dvp->v_mount); + pnode = dnode->tn_dir.tn_parent; + tmpfs_ref_node(pnode); error = vn_vget_ino_gen(dvp, tmpfs_vn_get_ino_alloc, - dnode->tn_dir.tn_parent, cnp->cn_lkflags, vpp); + pnode, cnp->cn_lkflags, vpp); + tmpfs_free_node(tm, pnode); if (error != 0) goto out; } else if (cnp->cn_namelen == 1 && cnp->cn_nameptr[0] == '.') {