From: Mateusz Guzik
Date: Mon, 13 Jan 2020 03:40:22 +0100
Subject: Re: svn commit: r356673 - in head/sys: kern sys
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
In-Reply-To: <202001130239.00D2df0x028071@repo.freebsd.org>
References: <202001130239.00D2df0x028071@repo.freebsd.org>
List-Id: SVN commit messages for the src tree for head/-current

On 1/13/20, Mateusz Guzik wrote:
> Author: mjg
> Date: Mon Jan 13 02:39:41 2020
> New Revision: 356673
> URL: https://svnweb.freebsd.org/changeset/base/356673
>
> Log:
>   vfs: per-cpu batched requeuing of free vnodes
>
>   Constant requeuing adds significant lock contention in certain
>   workloads. Lessen the problem by batching it.
>
>   Per-cpu areas are locked in order to synchronize against UMA freeing
>   memory.
>
>   vnode's v_mflag is converted to short to prevent the struct from
>   growing.
>
>   Sample result from an incremental make -s -j 104 bzImage on tmpfs:
>   stock:   122.38s user 1780.45s system 6242% cpu 30.480 total
>   patched: 144.84s user 985.90s system 4856% cpu 23.282 total
>
>   Reviewed by:	jeff

That's: jeff (previous version)

>   Tested by:	pho (in a larger patch, previous version)
>   Differential Revision:	https://reviews.freebsd.org/D22998
>
> Modified:
>   head/sys/kern/vfs_subr.c
>   head/sys/sys/vnode.h
>
> Modified: head/sys/kern/vfs_subr.c
> ==============================================================================
> --- head/sys/kern/vfs_subr.c	Mon Jan 13 02:37:25 2020	(r356672)
> +++ head/sys/kern/vfs_subr.c	Mon Jan 13 02:39:41 2020	(r356673)
> @@ -295,6 +295,16 @@ static int	stat_rush_requests;	/* number of times I/O
>  SYSCTL_INT(_debug, OID_AUTO, rush_requests, CTLFLAG_RW,
> &stat_rush_requests, 0,
>      "Number of times I/O speeded up (rush requests)");
>
> +#define	VDBATCH_SIZE 8
> +struct vdbatch {
> +	u_int index;
> +	struct mtx lock;
> +	struct vnode *tab[VDBATCH_SIZE];
> +};
> +DPCPU_DEFINE_STATIC(struct vdbatch, vd);
> +
> +static void	vdbatch_dequeue(struct vnode *vp);
> +
>  /*
>   * When shutting down the syncer, run it at four times normal speed.
>   */
> @@ -552,6 +562,8 @@ vnode_init(void *mem, int size, int flags)
>  	 */
>  	rangelock_init(&vp->v_rl);
>
> +	vp->v_dbatchcpu = NOCPU;
> +
>  	mtx_lock(&vnode_list_mtx);
>  	TAILQ_INSERT_BEFORE(vnode_list_free_marker, vp, v_vnodelist);
>  	mtx_unlock(&vnode_list_mtx);
> @@ -568,6 +580,7 @@ vnode_fini(void *mem, int size)
>  	struct bufobj *bo;
>
>  	vp = mem;
> +	vdbatch_dequeue(vp);
>  	mtx_lock(&vnode_list_mtx);
>  	TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>  	mtx_unlock(&vnode_list_mtx);
> @@ -602,8 +615,9 @@ vnode_fini(void *mem, int size)
>  static void
>  vntblinit(void *dummy __unused)
>  {
> +	struct vdbatch *vd;
> +	int cpu, physvnodes, virtvnodes;
>  	u_int i;
> -	int physvnodes, virtvnodes;
>
>  	/*
>  	 * Desiredvnodes is a function of the physical memory size and the
> @@ -669,6 +683,12 @@ vntblinit(void *dummy __unused)
>  	for (i = 1; i <= sizeof(struct vnode); i <<= 1)
>  		vnsz2log++;
>  	vnsz2log--;
> +
> +	CPU_FOREACH(cpu) {
> +		vd = DPCPU_ID_PTR((cpu), vd);
> +		bzero(vd, sizeof(*vd));
> +		mtx_init(&vd->lock, "vdbatch", NULL, MTX_DEF);
> +	}
>  }
>  SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
>
> @@ -3199,7 +3219,99 @@ vholdnz(struct vnode *vp)
>  #endif
>  }
>
> +static void __noinline
> +vdbatch_process(struct vdbatch *vd)
> +{
> +	struct vnode *vp;
> +	int i;
> +
> +	mtx_assert(&vd->lock, MA_OWNED);
> +	MPASS(vd->index == VDBATCH_SIZE);
> +
> +	mtx_lock(&vnode_list_mtx);
> +	for (i = 0; i < VDBATCH_SIZE; i++) {
> +		vp = vd->tab[i];
> +		TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
> +		TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
> +		MPASS(vp->v_dbatchcpu != NOCPU);
> +		vp->v_dbatchcpu = NOCPU;
> +	}
> +	bzero(vd->tab, sizeof(vd->tab));
> +	vd->index = 0;
> +	mtx_unlock(&vnode_list_mtx);
> +}
> +
> +static void
> +vdbatch_enqueue(struct vnode *vp)
> +{
> +	struct vdbatch *vd;
> +
> +	ASSERT_VI_LOCKED(vp, __func__);
> +	VNASSERT(!VN_IS_DOOMED(vp), vp,
> +	    ("%s: deferring requeue of a doomed vnode", __func__));
> +
> +	if (vp->v_dbatchcpu != NOCPU) {
> +		VI_UNLOCK(vp);
> +		return;
> +	}
> +
> +	/*
> +	 * A hack: pin us to the current CPU so that we know what to put in
> +	 * ->v_dbatchcpu.
> +	 */
> +	sched_pin();
> +	vd = DPCPU_PTR(vd);
> +	mtx_lock(&vd->lock);
> +	MPASS(vd->index < VDBATCH_SIZE);
> +	MPASS(vd->tab[vd->index] == NULL);
> +	vp->v_dbatchcpu = curcpu;
> +	vd->tab[vd->index] = vp;
> +	vd->index++;
> +	VI_UNLOCK(vp);
> +	if (vd->index == VDBATCH_SIZE)
> +		vdbatch_process(vd);
> +	mtx_unlock(&vd->lock);
> +	sched_unpin();
> +}
> +
>  /*
> + * This routine must only be called for vnodes which are about to be
> + * deallocated. Supporting dequeue for arbitrary vndoes would require
> + * validating that the locked batch matches.
> + */
> +static void
> +vdbatch_dequeue(struct vnode *vp)
> +{
> +	struct vdbatch *vd;
> +	int i;
> +	short cpu;
> +
> +	VNASSERT(vp->v_type == VBAD || vp->v_type == VNON, vp,
> +	    ("%s: called for a used vnode\n", __func__));
> +
> +	cpu = vp->v_dbatchcpu;
> +	if (cpu == NOCPU)
> +		return;
> +
> +	vd = DPCPU_ID_PTR(cpu, vd);
> +	mtx_lock(&vd->lock);
> +	for (i = 0; i < vd->index; i++) {
> +		if (vd->tab[i] != vp)
> +			continue;
> +		vp->v_dbatchcpu = NOCPU;
> +		vd->index--;
> +		vd->tab[i] = vd->tab[vd->index];
> +		vd->tab[vd->index] = NULL;
> +		break;
> +	}
> +	mtx_unlock(&vd->lock);
> +	/*
> +	 * Either we dequeued the vnode above or the target CPU beat us to it.
> +	 */
> +	MPASS(vp->v_dbatchcpu == NOCPU);
> +}
> +
> +/*
>   * Drop the hold count of the vnode.  If this is the last reference to
>   * the vnode we place it on the free list unless it has been vgone'd
>   * (marked VIRF_DOOMED) in which case we will free it.
> @@ -3236,12 +3348,8 @@ vdrop_deactivate(struct vnode *vp)
>  		mp->mnt_lazyvnodelistsize--;
>  		mtx_unlock(&mp->mnt_listmtx);
>  	}
> -	mtx_lock(&vnode_list_mtx);
> -	TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
> -	TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
> -	mtx_unlock(&vnode_list_mtx);
>  	atomic_add_long(&freevnodes, 1);
> -	VI_UNLOCK(vp);
> +	vdbatch_enqueue(vp);
>  }
>
>  void
>
> Modified: head/sys/sys/vnode.h
> ==============================================================================
> --- head/sys/sys/vnode.h	Mon Jan 13 02:37:25 2020	(r356672)
> +++ head/sys/sys/vnode.h	Mon Jan 13 02:39:41 2020	(r356673)
> @@ -171,7 +171,8 @@ struct vnode {
>  	u_int	v_usecount;		/* I ref count of users */
>  	u_int	v_iflag;		/* i vnode flags (see below) */
>  	u_int	v_vflag;		/* v vnode flags */
> -	u_int	v_mflag;		/* l mnt-specific vnode flags */
> +	u_short	v_mflag;		/* l mnt-specific vnode flags */
> +	short	v_dbatchcpu;		/* i LRU requeue deferral batch */
>  	int	v_writecount;		/* I ref count of writers or
>  					   (negative) text users */
>  	u_int	v_hash;
>

-- 
Mateusz Guzik