From owner-dev-commits-src-main@freebsd.org Mon Aug 2 02:58:36 2021
Date: Mon, 2 Aug 2021 02:58:36 GMT
Message-Id: <202108020258.1722walG094152@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Alexander Motin
Subject: git: 8bb173fb5bc3 - main - sched_ule(4): Use trylock when stealing load.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: mav
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 8bb173fb5bc33a02d5a4670c9a60bba0ece07bac
Auto-Submitted: auto-generated
List-Id: Commit messages for the main branch of the src repository

The branch main has been updated by mav:

URL: https://cgit.FreeBSD.org/src/commit/?id=8bb173fb5bc33a02d5a4670c9a60bba0ece07bac

commit 8bb173fb5bc33a02d5a4670c9a60bba0ece07bac
Author:     Alexander Motin
AuthorDate: 2021-08-02 02:42:01 +0000
Commit:     Alexander Motin
CommitDate: 2021-08-02 02:42:01 +0000

    sched_ule(4): Use trylock when stealing load.

    On some load patterns it is possible for several CPUs to try to steal
    a thread from the same CPU despite the randomization introduced.  This
    may cause significant lock contention when an idle thread holding one
    queue lock tries to acquire another one.  Using trylock on the remote
    queue both reduces the contention and makes lock ordering easier to
    handle.  If we can't get the lock inside tdq_trysteal(), we just
    return, allowing tdq_idled() to handle it.  If it happens in
    tdq_idled(), we repeat the search for load, skipping this CPU.

    On a 2-socket 80-thread Xeon system I am observing a dramatic
    reduction of lock spinning time when doing random uncached 4KB reads
    from 12 ZVOLs, while IOPS increase from 327K to 403K.

    MFC after:      1 month
---
 sys/kern/sched_ule.c | 39 +++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c
index 028e07efa889..1bdcfb1f793d 100644
--- a/sys/kern/sched_ule.c
+++ b/sys/kern/sched_ule.c
@@ -300,6 +300,8 @@ static struct tdq tdq_cpu;
 #define TDQ_LOCK_ASSERT(t, type) mtx_assert(TDQ_LOCKPTR((t)), (type))
 #define TDQ_LOCK(t) mtx_lock_spin(TDQ_LOCKPTR((t)))
 #define TDQ_LOCK_FLAGS(t, f) mtx_lock_spin_flags(TDQ_LOCKPTR((t)), (f))
+#define TDQ_TRYLOCK(t) mtx_trylock_spin(TDQ_LOCKPTR((t)))
+#define TDQ_TRYLOCK_FLAGS(t, f) mtx_trylock_spin_flags(TDQ_LOCKPTR((t)), (f))
 #define TDQ_UNLOCK(t) mtx_unlock_spin(TDQ_LOCKPTR((t)))
 #define TDQ_LOCKPTR(t) ((struct mtx *)(&(t)->tdq_lock))
 
@@ -989,13 +991,22 @@ tdq_idled(struct tdq *tdq)
 		if (steal->tdq_load < steal_thresh ||
 		    steal->tdq_transferable == 0)
 			goto restart;
-		tdq_lock_pair(tdq, steal);
 		/*
-		 * We were assigned a thread while waiting for the locks.
-		 * Switch to it now instead of stealing a thread.
+		 * Try to lock both queues. If we are assigned a thread while
+		 * waited for the lock, switch to it now instead of stealing.
+		 * If we can't get the lock, then somebody likely got there
+		 * first so continue searching.
 		 */
-		if (tdq->tdq_load)
-			break;
+		TDQ_LOCK(tdq);
+		if (tdq->tdq_load > 0) {
+			mi_switch(SW_VOL | SWT_IDLE);
+			return (0);
+		}
+		if (TDQ_TRYLOCK_FLAGS(steal, MTX_DUPOK) == 0) {
+			TDQ_UNLOCK(tdq);
+			CPU_CLR(cpu, &mask);
+			continue;
+		}
 		/*
 		 * The data returned by sched_highest() is stale and
 		 * the chosen CPU no longer has an eligible thread, or
@@ -1948,18 +1959,18 @@ tdq_trysteal(struct tdq *tdq)
 		if (steal->tdq_load < steal_thresh ||
 		    steal->tdq_transferable == 0)
 			continue;
-		tdq_lock_pair(tdq, steal);
 		/*
-		 * If we get to this point, unconditonally exit the loop
-		 * to bound the time spent in the critcal section.
-		 *
-		 * If a thread was added while interrupts were disabled don't
-		 * steal one here.
+		 * Try to lock both queues. If we are assigned a thread while
+		 * waited for the lock, switch to it now instead of stealing.
+		 * If we can't get the lock, then somebody likely got there
+		 * first. At this point unconditonally exit the loop to
+		 * bound the time spent in the critcal section.
 		 */
-		if (tdq->tdq_load > 0) {
-			TDQ_UNLOCK(steal);
+		TDQ_LOCK(tdq);
+		if (tdq->tdq_load > 0)
+			break;
+		if (TDQ_TRYLOCK_FLAGS(steal, MTX_DUPOK) == 0)
 			break;
-		}
 		/*
 		 * The data returned by sched_highest() is stale and
 		 * the chosen CPU no longer has an eligible thread.
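
For readers who want to try the locking pattern outside the kernel, here is a rough userland sketch of the idea described in the commit message: take our own queue lock unconditionally, but only trylock the remote queue and move on if it is busy. It uses POSIX mutexes rather than the kernel mtx(9) spin locks, and struct queue, STEAL_THRESH, try_steal() and demo() are hypothetical names made up for this illustration; none of them come from sched_ule.c. The unlocked peek at the victim's load mirrors the stale-data check in the diff, and a real multithreaded version would need atomics for it.

```c
#include <assert.h>
#include <pthread.h>

#define NQUEUES		4
#define STEAL_THRESH	2	/* hypothetical: minimum victim load worth stealing */

/* Hypothetical stand-in for struct tdq: one lock and a load counter. */
struct queue {
	pthread_mutex_t	lock;
	int		load;
};

struct queue queues[NQUEUES];

/*
 * Move one unit of load from another queue to queues[self].
 * Returns the victim's index, or -1 if nothing was stolen.
 */
int
try_steal(int self)
{
	struct queue *me = &queues[self];

	for (int i = 0; i < NQUEUES; i++) {
		struct queue *victim = &queues[i];

		/* Unlocked peek, so the value may be stale; rechecked below. */
		if (i == self || victim->load < STEAL_THRESH)
			continue;
		pthread_mutex_lock(&me->lock);
		if (me->load > 0) {
			/* Work arrived while we waited; no need to steal. */
			pthread_mutex_unlock(&me->lock);
			return (-1);
		}
		if (pthread_mutex_trylock(&victim->lock) != 0) {
			/* Victim is contended: drop our lock, try the next one. */
			pthread_mutex_unlock(&me->lock);
			continue;
		}
		if (victim->load >= STEAL_THRESH) {
			victim->load--;
			me->load++;
			pthread_mutex_unlock(&victim->lock);
			pthread_mutex_unlock(&me->lock);
			return (i);
		}
		/* The peek was stale; release both locks and keep looking. */
		pthread_mutex_unlock(&victim->lock);
		pthread_mutex_unlock(&me->lock);
	}
	return (-1);
}

/* Single-threaded demonstration: queue 1 is held busy, queue 2 has spare load. */
int
demo(void)
{
	int victim;

	for (int i = 0; i < NQUEUES; i++) {
		pthread_mutex_init(&queues[i].lock, NULL);
		queues[i].load = 0;
	}
	queues[1].load = 5;
	queues[2].load = 3;
	pthread_mutex_lock(&queues[1].lock);	/* simulate another CPU holding it */
	victim = try_steal(0);
	pthread_mutex_unlock(&queues[1].lock);
	assert(queues[1].load == 5);		/* contended queue was left alone */
	return (victim);
}
```

In demo(), queue 1 has the most load but its lock is held, so try_steal(0) skips it without blocking and takes one unit from queue 2 instead. Since a trylock never waits, the two locks can be taken in any order without risk of deadlock, which is the "handle lock ordering easier" point from the commit message.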