Date: Wed, 13 Feb 2019 12:23:38 -0800
From: Mark Millard <marklmi@yahoo.com>
To: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, freebsd-hackers Hackers <freebsd-hackers@freebsd.org>
Subject: Questions with a powerpc64/powerpc context: relaxed use of smp_cpus in umtx_busy vs. relaxed updates to smp_cpus in machine dependent code?
Message-ID: <096EABF3-1876-4E0C-9C16-ECF5C068B189@yahoo.com>
Why I ask the questions below (after providing context):
There are boot issues on old multi-processor PowerMac G5s,
which frequently (but not always) hang during cpu_mp_unleash.

/usr/src/sys/kern/kern_umtx.c has the following code (note the
smp_cpus use in the machine-independent code):

static inline void
umtxq_busy(struct umtx_key *key)
{
        struct umtxq_chain *uc;

        uc = umtxq_getchain(key);
        mtx_assert(&uc->uc_lock, MA_OWNED);
        if (uc->uc_busy) {
#ifdef SMP
                if (smp_cpus > 1) {
                        int count = BUSY_SPINS;
                        if (count > 0) {
                                umtxq_unlock(key);
                                while (uc->uc_busy && --count > 0)
                                        cpu_spinwait();
                                umtxq_lock(key);
                        }
                }
#endif
                while (uc->uc_busy) {
                        uc->uc_waiters++;
                        msleep(uc, &uc->uc_lock, 0, "umtxqb", 0);
                        uc->uc_waiters--;
                }
        }
        uc->uc_busy = 1;
}

The use of smp_cpus here on powerpc would be what is called a
std::memory_order_relaxed load in C++ terms.

smp_cpus does change during the machine-dependent code
cpu_mp_unleash in /usr/src/sys/powerpc/powerpc/mp_machdep.c :

static void
cpu_mp_unleash(void *dummy)
{
. . .
        smp_cpus = 0;
. . .
        STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
. . .
                if (pc->pc_awake) {
                        if (bootverbose)
                                printf("Adding CPU %d, hwref=%jx, awake=%x\n",
                                    pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
                                    pc->pc_awake);
                        smp_cpus++;
                } else
. . .
}

which are relaxed stores. [This does not appear to be a
std::memory_order_consume-like context (no dependency-ordered-before
usage).]

/usr/src/sys/kern/subr_smp.c does initialize smp_cpus to 1 in
its definition. (But it temporarily reverts to zero in the
above code.)

So far I've not managed to track down examples of specific code
(in an objdump of the kernel, say) that uses some form(s) of the
following to control access order in the various places
umtxq_busy is used:

lwsync (acquire/release/AcqRel fence or store-release [with
        load-acquire code as well])
or:
sync (a.k.a. hwsync and sync 0) (sequentially consistent
        fence/store/load)

Note: smp_cpus is not even volatile so, potentially, for a time a
register could be all that holds the sequence of smp_cpus values,
with memory being updated only later.

Nor have I yet found the earliest use of the umtxq_busy code. If
it is late enough after cpu_mp_unleash, that might implicitly
provide the ordering by something that is not a local code
structure.

Can anyone point me to example(s) of what controls umtxq_busy
necessarily accessing the intended smp_cpus value?

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
  in early 2018-Mar)