From: Mark Millard
Date: Mon, 13 May 2019 03:23:33 -0700
To: Justin Hibbits, FreeBSD PowerPC ML
Subject: An experiment in PowerMac G5 multi-socket/multi-core having better matching mftb() values

I've been experimenting with an alternate technique for dealing with
boot-time tbr value synchronization across sockets/cores on 970-family
PowerMac G5s. So far it has narrowed the range of mismatch significantly.
I've reverted my hack for tolerating the mismatches in order to see how
it goes.

I'm not aware of other contexts having the threads-get-stuck-sleeping
problem from the scale of tbr mismatch that can happen as things are
officially. And, if there are any, I have no environment in which to
test them.

The technique definitely requires that the relationship between the rate
at which mftb() values change and the time it takes to
store-release/load-acq each way between an ap and the bsp be such that
the round trip time is reasonably measurable, with a useful combination
of accuracy and precision. Thus there are limits to its generality if
some other context attempted something analogous. Each ap does its own
instance of the process. No single delta would work.

It is also based on the expectation that the store-release/load-acq each
way takes a non-trivial amount of the round trip time, putting the bsp's
activity in the middle part of the round trip range. (Interrupts are
disabled around the relevant code.) What I've seen suggests that this is
true.

I've included my exploratory code below. It is based on my head -r345758
context.

The ap sends the bsp a mftb() value (which the bsp uses only as a flag
that it is time to send back its own mftb() value). The ap also
calculates the approximate round trip ap-time for this exchange. From
these the ap comes up with the adjustment to its mftb() values to
approximate the mftb() value the bsp provided. The ap then uses that as
an adjustment to the ap's tbr value (via mttb()).

So far the results seem to be a sizable improvement.

[In experiments I've been labeling some variables volatile, just to
indicate that I generally do not expect loads/stores to be skipped for them.
This does not mean that I'd observed any cases of just holding a value
in a register. This may produce minor text mismatches with other files
not shown here.]

# svnlite diff /usr/src/sys/powerpc/powermac/platform_powermac.c /usr/src/sys/powerpc/powerpc/mp_machdep.c | more
Index: /usr/src/sys/powerpc/powermac/platform_powermac.c
===================================================================
--- /usr/src/sys/powerpc/powermac/platform_powermac.c	(revision 345758)
+++ /usr/src/sys/powerpc/powermac/platform_powermac.c	(working copy)
@@ -55,7 +55,7 @@
 
 #include "platform_if.h"
 
-extern void *ap_pcpu;
+extern void * volatile ap_pcpu;
 
 static int powermac_probe(platform_t);
 static int powermac_attach(platform_t);
@@ -333,6 +333,10 @@
 	return (powermac_smp_fill_cpuref(cpuref, bsp));
 }
 
+#ifdef __powerpc64__
+extern volatile int alternate_timebase_sync_style;
+#endif
+
 static int
 powermac_smp_start_cpu(platform_t plat, struct pcpu *pc)
 {
@@ -366,6 +370,19 @@
 	}
 
 	ap_pcpu = pc;
+#ifdef __powerpc64__
+	switch (mfpvr()>>16)
+	{
+	case IBM970:
+	case IBM970FX:
+	case IBM970MP:
+		alternate_timebase_sync_style= 1;
+		break;
+	default:
+		break;
+	}
+#endif
+
 	powerpc_sync();
 
 	if (rstvec_virtbase == NULL)
 		rstvec_virtbase = pmap_mapdev(0x80000000, PAGE_SIZE);
Index: /usr/src/sys/powerpc/powerpc/mp_machdep.c
===================================================================
--- /usr/src/sys/powerpc/powerpc/mp_machdep.c	(revision 345758)
+++ /usr/src/sys/powerpc/powerpc/mp_machdep.c	(working copy)
@@ -70,6 +70,13 @@
 static struct mtx ap_boot_mtx;
 struct pcb stoppcbs[MAXCPU];
 
+#if defined(__powerpc64__) && defined(AIM)
+// Part of: Attempt a better-than-historical approximately equal timebase value for ap vs. bsp
+volatile int alternate_timebase_sync_style= 0;
+volatile uint64_t timebase_samples[2]; // 0: from ap; 1: from bsp.
+                                       // Consider separate cache lines?
+#endif
+
 void
 machdep_ap_bootstrap(void)
 {
@@ -77,19 +84,65 @@
 	PCPU_SET(awake, 1);
 	__asm __volatile("msync; isync");
 
+#if defined(__powerpc64__) && defined(AIM)
+	// Attempt a better-than-historical approximately equal timebase value for ap vs. bsp
+	powerpc_sync();
+	isync();
+	if (alternate_timebase_sync_style) // Requires: timeframe with only one ap at a time
+	{
+		register_t oldmsr= intr_disable();
+
+		while (1u!=timebase_samples[1])
+			; // spin waiting for bsp to flag that ready to start.
+
+		// Measure a round trip:: to the bsp and back.
+
+		isync(); // Be sure below mftb() result is not from earlier speculative execution.
+		atomic_store_rel_64(&timebase_samples[0], mftb()); // bsp waits for this before its mftb().
+
+		while (1u==timebase_samples[1]) // expect bsp to have: 1upc_cpuid, (uintmax_t)pc->pc_hwref,
 				    pc->pc_awake);
 			smp_cpus++;
+
+#if defined(__powerpc64__) && defined(AIM)
+			// Part of: Attempt a better-than-historical approximately
+			// equal timebase value for ap vs. bsp
+			powerpc_sync();
+			isync();
+			if (alternate_timebase_sync_style)
+			{
+				register_t oldmsr= intr_disable();
+
+				atomic_store_rel_64(&timebase_samples[1], 1u); // flag ap that bsp is ready to start.
+
+				while (0u==timebase_samples[0]) // Expect on ap's: 0upc_cpuid, &stopped_cpus);
 }
@@ -257,14 +338,22 @@
 
 	ap_awake = 1;
 
-	/* Provide our current DEC and TB values for APs */
-	ap_timebase = mftb() + 10;
-	__asm __volatile("msync; isync");
+#if defined(__powerpc64__) && defined(AIM)
+	if (!alternate_timebase_sync_style)
+#endif
+	{
+		/* Provide our current DEC and TB values for APs */
+		ap_timebase = mftb() + 10;
+		__asm __volatile("msync; isync");
+	}
 
 	/* Let APs continue */
 	atomic_store_rel_int(&ap_letgo, 1);
 
-	platform_smp_timebase_sync(ap_timebase, 0);
+#if defined(__powerpc64__) && defined(AIM)
+	if (!alternate_timebase_sync_style)
+#endif
+		platform_smp_timebase_sync(ap_timebase, 0);
 
 	while (ap_awake < smp_cpus)
 		;

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
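
The round-trip adjustment described at the top of this message can be illustrated with a small stand-alone sketch. This is not the code from the patch: the helper names tb_offset_estimate() and tb_adjusted() are hypothetical, plain integers stand in for mftb() readings, and the sketch assumes (as the message expects, but does not prove) that the bsp samples its timebase near the middle of the round trip the ap measures.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): estimate the signed offset
 * to apply to the ap's timebase.
 *
 *   ap_start   - ap timebase value just before signalling the bsp
 *   ap_end     - ap timebase value just after the bsp's sample arrives
 *   bsp_sample - timebase value the bsp reported
 *
 * Assumption: the bsp took its sample at roughly the midpoint of the
 * ap's measured round trip.
 */
static int64_t
tb_offset_estimate(uint64_t ap_start, uint64_t ap_end, uint64_t bsp_sample)
{
	uint64_t round_trip = ap_end - ap_start;
	uint64_t ap_midpoint = ap_start + round_trip / 2;

	/* Unsigned subtraction wraps; the cast recovers the signed delta. */
	return ((int64_t)(bsp_sample - ap_midpoint));
}

/*
 * Hypothetical helper: the value the ap would write back (the patch
 * does the write via mttb()) given its current reading and the offset.
 */
static uint64_t
tb_adjusted(uint64_t ap_now, int64_t offset)
{
	return (ap_now + (uint64_t)offset);
}
```

With an ap round trip running from 1000 to 1100 ticks and a bsp sample of 5050, the ap midpoint is 1050, so the estimated offset is 4000 and an ap reading of 1100 would be adjusted to 5100. The residual error of such an estimate is bounded by half the round trip, which is why the round trip has to be measurable at a scale comparable to the timebase tick rate.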