From nobody Fri Jul 28 00:17:51 2023 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4RBpBD2GcFz4pgWp; Fri, 28 Jul 2023 00:17:56 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-oi1-x233.google.com (mail-oi1-x233.google.com [IPv6:2607:f8b0:4864:20::233]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4RBpB93YTpz4Q05; Fri, 28 Jul 2023 00:17:53 +0000 (UTC) (envelope-from mjguzik@gmail.com) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20221208 header.b=WEAm+Oxk; spf=pass (mx1.freebsd.org: domain of mjguzik@gmail.com designates 2607:f8b0:4864:20::233 as permitted sender) smtp.mailfrom=mjguzik@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-oi1-x233.google.com with SMTP id 5614622812f47-3a3df1ee4a3so1073766b6e.3; Thu, 27 Jul 2023 17:17:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1690503472; x=1691108272; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=DJGXMP3oorebjcVFxGyIBHnZe4FEaK/sGm/zwbAaG7Y=; b=WEAm+OxkkVsuQilISykP9Wb0omJSZyPi/kMRYAIf7GBnhw3NdJ+rLd6wcXQFBHk3cy eiuosEfAUo5HB9v+tSoc0zKou7z0iFA/xpaUTqT8bNBhALI4MhQ6m8Ipk4MSWQmMdf4b iDWV6OVsMtv6b2I7aiMzl3bwQwg6bRE6zIMWYvXd6WT990m4y+AGdlqkSYGgFZDAcX9t +45fsW5jPEoFbAkYccdVnAYi5bEKtiEW4/fgRUKaESEkUdesC/j4Bqb9wsOmPkzWnCBn csGaUT1WNP6KrZ8f4gbxNVaftQZXaD3CUFOZk6KSWv5MncEO4qYvOEH8uEU/c+Em/89f saKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690503472; x=1691108272; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DJGXMP3oorebjcVFxGyIBHnZe4FEaK/sGm/zwbAaG7Y=; b=Retdjkcz15DFBtDr6NrqBOkTXZCit3j2K1ni0RCHeH10Eyv8ZuzNJBPjQWhGpOvyiO Zjxe3Qgfi//Ms8q4oSdV4Tfsnp8sL5AeE2R5nUJaDuOS7AQRuzh8p5XeYw1nX5jcMuEQ l2a0El3lYY/3L+Ve2546eMCByPUQmQpm47pNkAjzWHJIn0W05vp8RY30ifNKbqgQmo6i +6qRZpJJBpT1OEV1auJ2ESoh2LdTvqahQujMqeVm9gIGjlKGlNPtUU5BFl5Reqrfb3mt W+rZ8MpAEgDPW94n16m19sAD3tBnsfc5+pi+cehLL+2zhSlKaT23nGw16FnlzGc18E+g tJ2w== X-Gm-Message-State: ABy/qLbcQOfcRqe26LPKHm/+AxcYptCZ8pyo1J+FpqfOc3wqEG8iJ9rI uH6vGOeaY7n+6Y/+r9Sdbm711yzA1ybe70XDGgn2NJwB X-Google-Smtp-Source: APBJJlHJN6IhEIxNYGCZ0pjaNLwhKOf64nbZSq++sN5So5PoFjWHqZJ3CyEHVuLlP+hvlbf52dXx5Bg8tRprKMtXGXc= X-Received: by 2002:a05:6808:1b2b:b0:3a1:cbea:3bf2 with SMTP id bx43-20020a0568081b2b00b003a1cbea3bf2mr999395oib.11.1690503471921; Thu, 27 Jul 2023 17:17:51 -0700 (PDT) List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Received: by 2002:ac9:6106:0:b0:4e1:6042:cdc9 with HTTP; Thu, 27 Jul 2023 17:17:51 -0700 (PDT) In-Reply-To: <202307242203.36OM3IwQ009522@gitrepo.freebsd.org> References: <202307242203.36OM3IwQ009522@gitrepo.freebsd.org> From: Mateusz Guzik Date: Fri, 28 Jul 2023 02:17:51 +0200 Message-ID: Subject: Re: git: 5b353925ff61 - main - vnode read(2)/write(2): acquire rangelock regardless of do_vn_io_fault() To: Konstantin Belousov Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Spamd-Result: default: False [-3.65 / 15.00]; NEURAL_HAM_MEDIUM(-0.99)[-0.991]; NEURAL_HAM_SHORT(-0.99)[-0.989]; NEURAL_HAM_LONG(-0.67)[-0.668]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20221208]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36]; MIME_GOOD(-0.10)[text/plain]; MLMMJ_DEST(0.00)[dev-commits-src-all@freebsd.org,dev-commits-src-main@freebsd.org]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::233:from]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ARC_NA(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; MID_RHS_MATCH_FROMTLD(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; FROM_HAS_DN(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; FREEMAIL_FROM(0.00)[gmail.com]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_TLS_LAST(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim] X-Rspamd-Queue-Id: 4RBpB93YTpz4Q05 X-Spamd-Bar: --- On 7/25/23, Konstantin Belousov wrote: > The branch main has been updated by kib: > > URL: > https://cgit.FreeBSD.org/src/commit/?id=5b353925ff61b9ddb97bb453ba75278b578ed7d9 > > commit 5b353925ff61b9ddb97bb453ba75278b578ed7d9 > Author: Konstantin Belousov > AuthorDate: 2023-07-23 15:55:50 +0000 > Commit: Konstantin Belousov > CommitDate: 2023-07-24 22:02:59 +0000 > > vnode read(2)/write(2): acquire rangelock regardless of > do_vn_io_fault() > > To ensure atomicity of reads against parallel writes and truncates, > vnode lock was not enough at least since introduction of vn_io_fault(). > That code only take rangelock when it was possible that vn_read() and > vn_write() could drop the vnode lock. > > At least since the introduction of VOP_READ_PGCACHE() which generally > does not lock the vnode at all, rangelocks become required even > for filesystems that do not need vn_io_fault() workaround. For > instance, tmpfs. > Is there a bug with pgcache reads disabled (as in when the vnode lock is held for reads?) Note this patch adds 2 lock trips which were previously not present, which has to slow things down single-threaded, but I did not bother measuring that part. As this adds to vnode-wide *lock* acquires this has to very negatively affect scalability. This time around I ran: ./readseek3_processes -t 10 (10 workers reading from *disjoint* offsets from the same vnode. this in principle can scale perfectly) I observed a 90% drop in performance: before: total:25723459 ops/s after: total:2455794 ops/s Going to an unpatched kernel and disabling pgcache reads instead: disabled: total:6522480 ops/s or about 2.6x of performance of the current kernel In other words I think the thing to do here is to revert the patch and instead flip pgcache reads to off by default until a better fix can be implemented. > PR: 272678 > Analyzed and reviewed by: Andrew Gierth > > Sponsored by: The FreeBSD Foundation > MFC after: 1 week > Differential revision: https://reviews.freebsd.org/D41158 > --- > sys/kern/vfs_vnops.c | 21 ++++++++++++++------- > 1 file changed, 14 insertions(+), 7 deletions(-) > > diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c > index 83e95731d7c4..306840ff0357 100644 > --- a/sys/kern/vfs_vnops.c > +++ b/sys/kern/vfs_vnops.c > @@ -1443,6 +1443,7 @@ vn_io_fault(struct file *fp, struct uio *uio, struct > ucred *active_cred, > void *rl_cookie; > struct vn_io_fault_args args; > int error; > + bool rl_locked; > > doio = uio->uio_rw == UIO_READ ? vn_read : vn_write; > vp = fp->f_vnode; > @@ -1465,12 +1466,7 @@ vn_io_fault(struct file *fp, struct uio *uio, struct > ucred *active_cred, > } > > foffset_lock_uio(fp, uio, flags); > - if (do_vn_io_fault(vp, uio)) { > - args.kind = VN_IO_FAULT_FOP; > - args.args.fop_args.fp = fp; > - args.args.fop_args.doio = doio; > - args.cred = active_cred; > - args.flags = flags | FOF_OFFSET; > + if (vp->v_type == VREG) { > if (uio->uio_rw == UIO_READ) { > rl_cookie = vn_rangelock_rlock(vp, uio->uio_offset, > uio->uio_offset + uio->uio_resid); > @@ -1482,11 +1478,22 @@ vn_io_fault(struct file *fp, struct uio *uio, struct > ucred *active_cred, > rl_cookie = vn_rangelock_wlock(vp, uio->uio_offset, > uio->uio_offset + uio->uio_resid); > } > + rl_locked = true; > + } else { > + rl_locked = false; > + } > + if (do_vn_io_fault(vp, uio)) { > + args.kind = VN_IO_FAULT_FOP; > + args.args.fop_args.fp = fp; > + args.args.fop_args.doio = doio; > + args.cred = active_cred; > + args.flags = flags | FOF_OFFSET; > error = vn_io_fault1(vp, uio, &args, td); > - vn_rangelock_unlock(vp, rl_cookie); > } else { > error = doio(fp, uio, active_cred, flags | FOF_OFFSET, td); > } > + if (rl_locked) > + vn_rangelock_unlock(vp, rl_cookie); > foffset_unlock_uio(fp, uio, flags); > return (error); > } > -- Mateusz Guzik