From nobody Thu Jul 14 14:47:37 2022
Date: Thu, 14 Jul 2022 14:47:37 GMT
Message-Id: <202207141447.26EElbAZ041128@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mark Johnston
Subject: git: 03f868b163ad - main - x86: Add a required store-load barrier in cpu_idle()
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 03f868b163ad46d6f7cb03dc46fb83ca01fb8f69
Auto-Submitted: auto-generated

The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=03f868b163ad46d6f7cb03dc46fb83ca01fb8f69

commit 03f868b163ad46d6f7cb03dc46fb83ca01fb8f69
Author:     Mark Johnston
AuthorDate: 2022-07-14 14:24:25 +0000
Commit:     Mark Johnston
CommitDate: 2022-07-14 14:28:01 +0000

    x86: Add a required store-load barrier in cpu_idle()

    ULE's tdq_notify() tries to avoid delivering IPIs to the idle thread.
    In particular, it tries to detect whether the idle thread is running.
    There are two mechanisms for this:
    - tdq_cpu_idle, an MI flag which is set prior to calling cpu_idle().
      If tdq_cpu_idle == 0, then no IPI is needed;
    - idle_state, an x86-specific state flag which is updated after
      cpu_idleclock() is called.

    The implementation of the second mechanism is racy; the race can cause
    a CPU to go to sleep with pending work.  Specifically, cpu_idle_*() set
    idle_state = STATE_SLEEPING, then check for pending work by loading the
    tdq_load field of the CPU's runqueue.  These operations can be reordered
    so that the idle thread observes tdq_load == 0, and tdq_notify()
    observes idle_state == STATE_RUNNING.

    Some counters indicate that the idle_state check in tdq_notify()
    frequently elides an IPI.  So, fix the problem by inserting a fence
    after the store to idle_state, immediately before idling the CPU.
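
The race described in the commit message is an instance of the store-buffering pattern: each CPU stores to one word and then loads the other, and on x86 a load may become globally visible before an earlier store to a different address. The following is a minimal sketch of that reasoning, not kernel code. It assumes a simplified two-CPU model in which each CPU contributes one "store commits" event and one "load" event, and it enumerates every interleaving. The helper names (count_bits, run_schedule, lost_wakeup_possible) and the numeric state values are hypothetical and chosen for illustration only; the notifying CPU is assumed to always order its store before its load, as the commit's code comment says tdq_notify() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Names mirror the commit; the numeric values are assumed. */
enum { STATE_RUNNING = 0, STATE_SLEEPING = 1 };

/* A CPU's two globally ordered events: its store commits, or its load runs. */
enum event { COMMIT, LOAD };

struct outcome {
	int idle_saw_load;	/* tdq_load as read by the idle CPU */
	int notifier_saw_state;	/* idle_state as read by the notifier */
};

static int
count_bits(unsigned v)
{
	int n = 0;

	for (; v != 0; v >>= 1)
		n += v & 1;
	return (n);
}

/*
 * Replay one interleaving.  Bit i of sched selects which CPU performs
 * global step i: 1 for the idle CPU, 0 for the notifying CPU.
 */
static struct outcome
run_schedule(const enum event idle[2], const enum event notif[2],
    unsigned sched)
{
	struct outcome o = { -1, -1 };
	int idle_state = STATE_RUNNING, tdq_load = 0;
	int ii = 0, ni = 0;

	assert(sched < 16 && count_bits(sched) == 2);
	for (int step = 0; step < 4; step++) {
		if ((sched >> step) & 1) {
			if (idle[ii++] == COMMIT)
				idle_state = STATE_SLEEPING;
			else
				o.idle_saw_load = tdq_load;
		} else {
			if (notif[ni++] == COMMIT)
				tdq_load = 1;
			else
				o.notifier_saw_state = idle_state;
		}
	}
	return (o);
}

/*
 * Return true if some interleaving leaves the idle CPU seeing no load
 * while the notifier sees STATE_RUNNING (and so would skip the IPI).
 * Without a fence the idle CPU's load may pass its store.
 */
static bool
lost_wakeup_possible(bool idle_fence)
{
	static const enum event in_order[2] = { COMMIT, LOAD };
	static const enum event load_first[2] = { LOAD, COMMIT };
	const enum event *idle_orders[2] = { in_order, load_first };
	int norders = idle_fence ? 1 : 2;

	for (int i = 0; i < norders; i++) {
		for (unsigned sched = 0; sched < 16; sched++) {
			struct outcome o;

			if (count_bits(sched) != 2)
				continue;
			o = run_schedule(idle_orders[i], in_order, sched);
			if (o.idle_saw_load == 0 &&
			    o.notifier_saw_state == STATE_RUNNING)
				return (true);
		}
	}
	return (false);
}
```

In this model, lost_wakeup_possible(false) finds the bad interleaving (idle load, notifier store, notifier load, idle store), while lost_wakeup_possible(true) finds none, because forcing the idle CPU's store to commit before its load makes the bad outcome require a cyclic ordering of the four events.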
    PR:             264867
    Reviewed by:    mav, kib, jhb
    MFC after:      1 month
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D35777
---
 sys/x86/x86/cpu_machdep.c | 103 ++++++++++++++++++++++++++++------------------
 1 file changed, 62 insertions(+), 41 deletions(-)

diff --git a/sys/x86/x86/cpu_machdep.c b/sys/x86/x86/cpu_machdep.c
index fa11f64e2779..040438043c73 100644
--- a/sys/x86/x86/cpu_machdep.c
+++ b/sys/x86/x86/cpu_machdep.c
@@ -52,6 +52,7 @@ __FBSDID("$FreeBSD$");
 #include "opt_maxmem.h"
 #include "opt_mp_watchdog.h"
 #include "opt_platform.h"
+#include "opt_sched.h"
 #ifdef __i386__
 #include "opt_apic.h"
 #endif
@@ -532,32 +533,25 @@ static int idle_mwait = 1;		/* Use MONITOR/MWAIT for short idle. */
 SYSCTL_INT(_machdep, OID_AUTO, idle_mwait, CTLFLAG_RWTUN, &idle_mwait,
     0, "Use MONITOR/MWAIT for short idle");
 
-static void
-cpu_idle_acpi(sbintime_t sbt)
+static bool
+cpu_idle_enter(int *statep, int newstate)
 {
-	int *state;
+	KASSERT(atomic_load_int(statep) == STATE_RUNNING,
+	    ("%s: state %d", __func__, atomic_load_int(statep)));
 
-	state = &PCPU_PTR(monitorbuf)->idle_state;
-	atomic_store_int(state, STATE_SLEEPING);
-
-	/* See comments in cpu_idle_hlt(). */
-	disable_intr();
-	if (sched_runnable())
-		enable_intr();
-	else if (cpu_idle_hook)
-		cpu_idle_hook(sbt);
-	else
-		acpi_cpu_c1();
-	atomic_store_int(state, STATE_RUNNING);
-}
-
-static void
-cpu_idle_hlt(sbintime_t sbt)
-{
-	int *state;
-
-	state = &PCPU_PTR(monitorbuf)->idle_state;
-	atomic_store_int(state, STATE_SLEEPING);
+	/*
+	 * A fence is needed to prevent reordering of the load in
+	 * sched_runnable() with this store to the idle state word.  Without it,
+	 * cpu_idle_wakeup() can observe the state as STATE_RUNNING after having
+	 * added load to the queue, and elide an IPI.  Then, sched_runnable()
+	 * can observe tdq_load == 0, so the CPU ends up idling with pending
+	 * work.  tdq_notify() similarly ensures that a prior update to tdq_load
+	 * is visible before calling cpu_idle_wakeup().
+	 */
+	atomic_store_int(statep, newstate);
+#if defined(SCHED_ULE) && defined(SMP)
+	atomic_thread_fence_seq_cst();
+#endif
 
 	/*
 	 * Since we may be in a critical section from cpu_idle(), if
@@ -576,35 +570,62 @@ cpu_idle_hlt(sbintime_t sbt)
 	 * interrupt.
 	 */
 	disable_intr();
-	if (sched_runnable())
+	if (sched_runnable()) {
 		enable_intr();
-	else
-		acpi_cpu_c1();
-	atomic_store_int(state, STATE_RUNNING);
+		atomic_store_int(statep, STATE_RUNNING);
+		return (false);
+	} else {
+		return (true);
+	}
 }
 
 static void
-cpu_idle_mwait(sbintime_t sbt)
+cpu_idle_exit(int *statep)
+{
+	atomic_store_int(statep, STATE_RUNNING);
+}
+
+static void
+cpu_idle_acpi(sbintime_t sbt)
 {
 	int *state;
 
 	state = &PCPU_PTR(monitorbuf)->idle_state;
-	atomic_store_int(state, STATE_MWAIT);
+	if (cpu_idle_enter(state, STATE_SLEEPING)) {
+		if (cpu_idle_hook)
+			cpu_idle_hook(sbt);
+		else
+			acpi_cpu_c1();
+		cpu_idle_exit(state);
+	}
+}
 
-	/* See comments in cpu_idle_hlt(). */
-	disable_intr();
-	if (sched_runnable()) {
+static void
+cpu_idle_hlt(sbintime_t sbt)
+{
+	int *state;
+
+	state = &PCPU_PTR(monitorbuf)->idle_state;
+	if (cpu_idle_enter(state, STATE_SLEEPING)) {
+		acpi_cpu_c1();
 		atomic_store_int(state, STATE_RUNNING);
-		enable_intr();
-		return;
 	}
+}
 
-	cpu_monitor(state, 0, 0);
-	if (atomic_load_int(state) == STATE_MWAIT)
-		__asm __volatile("sti; mwait" : : "a" (MWAIT_C1), "c" (0));
-	else
-		enable_intr();
-	atomic_store_int(state, STATE_RUNNING);
+static void
+cpu_idle_mwait(sbintime_t sbt)
+{
+	int *state;
+
+	state = &PCPU_PTR(monitorbuf)->idle_state;
+	if (cpu_idle_enter(state, STATE_MWAIT)) {
+		cpu_monitor(state, 0, 0);
+		if (atomic_load_int(state) == STATE_MWAIT)
+			__asm __volatile("sti; mwait" : : "a" (MWAIT_C1), "c" (0));
+		else
+			enable_intr();
+		cpu_idle_exit(state);
+	}
 }
 
 static void