Date: Mon, 29 Nov 2021 14:21:01 GMT
Message-Id: <202111291421.1ATEL16x055954@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Mark Johnston
Subject: git: fdd27db34802 - stable/13 - vm: Add a mode to vm_object_page_remove() which skips invalid pages
List-Id: Commits to the stable branches of the FreeBSD src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches
Sender: owner-dev-commits-src-branches@freebsd.org
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/13
X-Git-Reftype: branch
X-Git-Commit: fdd27db34802decf062339411e5f84993e733be0
Auto-Submitted: auto-generated

The branch stable/13 has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=fdd27db34802decf062339411e5f84993e733be0

commit fdd27db34802decf062339411e5f84993e733be0
Author:     Mark Johnston
AuthorDate: 2021-11-15 16:44:04 +0000
Commit:     Mark Johnston
CommitDate: 2021-11-29 14:09:28 +0000

    vm: Add a mode to vm_object_page_remove() which skips invalid pages

    This will be used to break a deadlock in ZFS between the per-mountpoint
    teardown lock and page busy locks. In particular, when purging data
    from the page cache during dataset rollback, we want to avoid blocking
    on the busy state of invalid pages since the busying thread may be
    blocked on the teardown lock in zfs_getpages().

    Add a helper, vn_pages_remove_valid(), for use by filesystems. Bump
    __FreeBSD_version so that the OpenZFS port can make use of the new
    helper.

    PR:             258208
    Reviewed by:    avg, kib, sef
    Tested by:      pho (part of a larger patch)
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit d28af1abf031ee87a478b37180e3f0c518caedf6)
---
 sys/kern/vfs_vnops.c | 22 ++++++++++++++++++++++
 sys/sys/vnode.h      |  2 ++
 sys/vm/vm_object.c   | 19 +++++++++++++++++++
 sys/vm/vm_object.h   |  1 +
 4 files changed, 44 insertions(+)

diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
index b78c24e3e313..afb1c6799825 100644
--- a/sys/kern/vfs_vnops.c
+++ b/sys/kern/vfs_vnops.c
@@ -2425,6 +2425,10 @@ vn_chown(struct file *fp, uid_t uid, gid_t gid, struct ucred *active_cred,
 	return (setfown(td, active_cred, vp, uid, gid));
 }
 
+/*
+ * Remove pages in the range ["start", "end") from the vnode's VM object. If
+ * "end" is 0, then the range extends to the end of the object.
+ */
 void
 vn_pages_remove(struct vnode *vp, vm_pindex_t start, vm_pindex_t end)
 {
@@ -2437,6 +2441,24 @@ vn_pages_remove(struct vnode *vp, vm_pindex_t start, vm_pindex_t end)
 	VM_OBJECT_WUNLOCK(object);
 }
 
+/*
+ * Like vn_pages_remove(), but skips invalid pages, which by definition are not
+ * mapped into any process' address space. Filesystems may use this in
+ * preference to vn_pages_remove() to avoid blocking on pages busied in
+ * preparation for a VOP_GETPAGES.
+ */
+void
+vn_pages_remove_valid(struct vnode *vp, vm_pindex_t start, vm_pindex_t end)
+{
+	vm_object_t object;
+
+	if ((object = vp->v_object) == NULL)
+		return;
+	VM_OBJECT_WLOCK(object);
+	vm_object_page_remove(object, start, end, OBJPR_VALIDONLY);
+	VM_OBJECT_WUNLOCK(object);
+}
+
 int
 vn_bmap_seekhole(struct vnode *vp, u_long cmd, off_t *off, struct ucred *cred)
 {
diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h
index ba5eafc80d4b..66e8a7c0a87e 100644
--- a/sys/sys/vnode.h
+++ b/sys/sys/vnode.h
@@ -763,6 +763,8 @@ int	vn_open_cred(struct nameidata *ndp, int *flagp, int cmode,
 int	vn_open_vnode(struct vnode *vp, int fmode, struct ucred *cred,
 	    struct thread *td, struct file *fp);
 void	vn_pages_remove(struct vnode *vp, vm_pindex_t start, vm_pindex_t end);
+void	vn_pages_remove_valid(struct vnode *vp, vm_pindex_t start,
+	    vm_pindex_t end);
 int	vn_pollrecord(struct vnode *vp, struct thread *p, int events);
 int	vn_rdwr(enum uio_rw rw, struct vnode *vp, void *base,
 	    int len, off_t offset, enum uio_seg segflg, int ioflg,
diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
index 5bbe7faed50b..47595d38137c 100644
--- a/sys/vm/vm_object.c
+++ b/sys/vm/vm_object.c
@@ -2094,6 +2094,21 @@ again:
 	for (; p != NULL && (p->pindex < end || end == 0); p = next) {
 		next = TAILQ_NEXT(p, listq);
 
+		/*
+		 * Skip invalid pages if asked to do so. Try to avoid acquiring
+		 * the busy lock, as some consumers rely on this to avoid
+		 * deadlocks.
+		 *
+		 * A thread may concurrently transition the page from invalid to
+		 * valid using only the busy lock, so the result of this check
+		 * is immediately stale. It is up to consumers to handle this,
+		 * for instance by ensuring that all invalid->valid transitions
+		 * happen with a mutex held, as may be possible for a
+		 * filesystem.
+		 */
+		if ((options & OBJPR_VALIDONLY) != 0 && vm_page_none_valid(p))
+			continue;
+
 		/*
 		 * If the page is wired for any reason besides the existence
 		 * of managed, wired mappings, then it cannot be freed. For
@@ -2106,6 +2121,10 @@ again:
 			vm_page_sleep_if_busy(p, "vmopar");
 			goto again;
 		}
+		if ((options & OBJPR_VALIDONLY) != 0 && vm_page_none_valid(p)) {
+			vm_page_xunbusy(p);
+			continue;
+		}
 		if (vm_page_wired(p)) {
 wired:
 			if ((options & OBJPR_NOTMAPPED) == 0 &&
diff --git a/sys/vm/vm_object.h b/sys/vm/vm_object.h
index adbe022417f4..2a16d8c6f096 100644
--- a/sys/vm/vm_object.h
+++ b/sys/vm/vm_object.h
@@ -232,6 +232,7 @@ struct vm_object {
  */
 #define	OBJPR_CLEANONLY	0x1		/* Don't remove dirty pages. */
 #define	OBJPR_NOTMAPPED	0x2		/* Don't unmap pages. */
+#define	OBJPR_VALIDONLY	0x4		/* Ignore invalid pages. */
 
 TAILQ_HEAD(object_q, vm_object);
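
Illustrative note (not part of the commit): the ordering that the
OBJPR_VALIDONLY check relies on, testing validity before touching the busy
state, can be sketched in userspace C. This is a simplified, single-threaded
model under stated assumptions, not kernel code. The names struct model_page,
struct model_result, model_page_remove() and MODEL_VALIDONLY are hypothetical
and exist only for this sketch; a busy page is merely counted here, whereas
the kernel sleeps on it and retries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for OBJPR_VALIDONLY; not the kernel definition. */
#define MODEL_VALIDONLY 0x4

/* Hypothetical, simplified page: the real vm_page has far more state. */
struct model_page {
	unsigned long pindex;	/* page index within the object */
	bool valid;		/* page holds at least some valid data */
	bool busy;		/* another thread holds the busy lock */
	bool present;		/* cleared once the page is removed */
};

struct model_result {
	size_t removed;		/* pages removed from the object */
	size_t skipped;		/* invalid pages left in place */
	size_t would_block;	/* busy pages the real code would sleep on */
};

/*
 * Remove pages in [start, end) from the model object. As in the commit,
 * end == 0 means "to the end of the object". With MODEL_VALIDONLY set,
 * invalid pages are skipped before their busy state is examined.
 */
static struct model_result
model_page_remove(struct model_page *pages, size_t npages,
    unsigned long start, unsigned long end, int options)
{
	struct model_result res = { 0, 0, 0 };

	assert(pages != NULL || npages == 0);
	for (size_t i = 0; i < npages; i++) {
		struct model_page *p = &pages[i];

		if (!p->present || p->pindex < start)
			continue;
		if (end != 0 && p->pindex >= end)
			continue;
		/* Validity is checked first, so busy invalid pages never block. */
		if ((options & MODEL_VALIDONLY) != 0 && !p->valid) {
			res.skipped++;
			continue;
		}
		if (p->busy) {
			/* The kernel would sleep and retry; the model only counts. */
			res.would_block++;
			continue;
		}
		p->present = false;
		res.removed++;
	}
	return (res);
}
```

In this model, a range containing an invalid page that is also busy reports
would_block == 0 when MODEL_VALIDONLY is passed, which mirrors the commit's
stated goal of not waiting on pages busied in preparation for a VOP_GETPAGES.
The model does not capture the staleness caveat from the diff comment, since
no other thread can change a page's validity here.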