Subject: Re: Questions with a powerpc64/powerpc context: relaxed use of smp_cpus in umtx_busy vs. relaxed updates to smp_cpus in machine dependent code?
To: Mark Millard, FreeBSD PowerPC ML, freebsd-hackers Hackers
From: Eric van Gyzen
Date: Wed, 13 Feb 2019 15:45:18 -0600
Message-ID: <4b60c6a0-76d5-813c-11c0-9983ba45f7a5@vangyzen.net>
In-Reply-To: <096EABF3-1876-4E0C-9C16-ECF5C068B189@yahoo.com>

On 2/13/19 2:23 PM, Mark Millard via freebsd-hackers wrote:
> Why I ask the questions below (after providing context):
> There are boot issues on old multi-processor PowerMac G5s that
> frequently hang up during cpu_mp_unleash, but not always.
>
>
> /usr/src/sys/kern/kern_umtx.c has the following code
> (note the smp_cpus use in the machine-independent code):
>
>
> static inline void
> umtxq_busy(struct umtx_key *key)
> {
>         struct umtxq_chain *uc;
>
>         uc = umtxq_getchain(key);
>         mtx_assert(&uc->uc_lock, MA_OWNED);
>         if (uc->uc_busy) {
> #ifdef SMP
>                 if (smp_cpus > 1) {
>                         int count = BUSY_SPINS;
>                         if (count > 0) {
>                                 umtxq_unlock(key);
>                                 while (uc->uc_busy && --count > 0)
>                                         cpu_spinwait();
>                                 umtxq_lock(key);
>                         }
>                 }
> #endif
>                 while (uc->uc_busy) {
>                         uc->uc_waiters++;
>                         msleep(uc, &uc->uc_lock, 0, "umtxqb", 0);
>                         uc->uc_waiters--;
>                 }
>         }
>         uc->uc_busy = 1;
> }
>
> The use of smp_cpus here on powerpc would be what is called
> a std::memory_order_relaxed load in c++ terms. smp_cpus
> does change during the machine-dependent code cpu_mp_unleash
> in /usr/src/sys/powerpc/powerpc/mp_machdep.c :
>
> static void
> cpu_mp_unleash(void *dummy)
> {
> . . .
>         smp_cpus = 0;
> . . .
>         STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
> . . .
>                 if (pc->pc_awake) {
>                         if (bootverbose)
>                                 printf("Adding CPU %d, hwref=%jx, awake=%x\n",
>                                     pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
>                                     pc->pc_awake);
>                         smp_cpus++;
>                 } else
> . . .
> }
>
> which are relaxed stores.
>
> [This does not appear to be a std::memory_order_consume like
> context (no dependency ordered before usage).]
>
> /usr/src/sys/kern/subr_smp.c does initialize smp_cpus to 1
> in its definition. (But it temporarily reverts to zero in
> the above code.)
>
> So far I've not managed to track down examples of specific
> code (in an objdump of the kernel, say) that matches up
> using some form(s) of the following to control access
> order in the various places umtxq_busy is used:
>
>         lwsync (acquire/release/AcqRel fence or store-release [with load-acquire code as well])
> or:
>         sync (a.k.a.
> hwsync and sync 0) (sequentially consistent fence/store/load)
>
> Note: smp_cpus is not even volatile so, potentially, for a time a register
> could be all that holds the sequence of smp_cpus values before memory is
> updated later.
>
> Nor have I yet found the earliest use of the umtxq_busy code. If it is
> late enough after cpu_mp_unleash, that might implicitly provide something
> that is not a local code structure.
>
> Can anyone point me to example(s) of what controls umtxq_busy necessarily
> accessing the intended smp_cpus value?

umtxq_busy() is only called by userland synchronization primitives, such
as mutexes, condition variables, and semaphores. Assuming
cpu_mp_unleash() is called before userland is started, umtxq_busy()
should see the correct value of smp_cpus. However, even if umtxq_busy()
sees a value of 0 or 1 when the correct value would be greater than 1, I
don't see how this could cause a problem, since it would take the safer
approach of sleeping instead of spinning.

Best of luck,

Eric
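
The last point above can be illustrated with a minimal userland sketch. This is not kernel code: it assumes C11 atomics with relaxed ordering as a stand-in for the plain int accesses in the kernel, and the names model_smp_cpus, model_unleash(), and model_would_spin() are hypothetical, introduced only for illustration. It shows that any value of 0 or 1 observed by the "smp_cpus > 1" test selects the sleeping path rather than the spin loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's smp_cpus. It starts at 1,
 * matching the initializer the quoted message attributes to subr_smp.c.
 */
static atomic_int model_smp_cpus = 1;

/*
 * Models the counting done in cpu_mp_unleash(): reset the count to 0,
 * then add one per awake CPU, using relaxed stores throughout.
 */
static void
model_unleash(int awake_cpus)
{
	assert(awake_cpus >= 0);
	atomic_store_explicit(&model_smp_cpus, 0, memory_order_relaxed);
	for (int i = 0; i < awake_cpus; i++)
		atomic_fetch_add_explicit(&model_smp_cpus, 1,
		    memory_order_relaxed);
}

/*
 * Models the "smp_cpus > 1" test in umtxq_busy() with a relaxed load.
 * Returns true when the spin loop would be tried before sleeping,
 * false when the caller would go straight to the sleep loop.
 */
static bool
model_would_spin(void)
{
	return (atomic_load_explicit(&model_smp_cpus,
	    memory_order_relaxed) > 1);
}
```

With the initial value of 1, or the transient value of 0, model_would_spin() returns false, so a stale read only skips the spin optimization. It returns true only once a count greater than 1 is visible, for example after model_unleash(2).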