Date: Sun, 19 Jan 2020 13:00:54 -0800 From: Mark Millard <marklmi@yahoo.com> To: Francis Little <oggy@farscape.co.uk> Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org> Subject: Re: G5 Quad Fans full speed after 1 min Message-ID: <DC70B788-24D0-4578-8A57-BB8DA753CAD4@yahoo.com> In-Reply-To: <21533290-667C-472E-89F7-D1E7DE773193@yahoo.com> References: <CAGSRtz76TC44UiECMepLAZjXE=H33sg4OFLJpmZpNCSEwE=ETQ@mail.gmail.com> <21533290-667C-472E-89F7-D1E7DE773193@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2020-Jan-19, at 03:13, Mark Millard <marklmi at yahoo.com> wrote: >=20 > On 2020-Jan-19, at 00:38, Francis Little <oggy at farscape.co.uk> = wrote: >=20 >> Hi, >>=20 >> My G5 Quad is running current from a few days ago, but this issue has = been >> happening for a long time. >>=20 >> After about 1 min of uptime, the fans go full speed. >>=20 >> As soon as I query anything like CPU temp or fan rpm with sysctl, the = fans >> return to a normal speed. >>=20 >> 1 min later the fans go full speed again. >>=20 >> I've been working round this for some time with a cron job that runs = sysctl >> with one of the cpu temp sensors to calm the system. >=20 > QUOTING an old message: > The mftb figures on the various cores can be so far apart that > threads can end-up stuck sleeping, such as syncr, pmac_thermal, > and buf*deamon* threads. (This can really mess things up by > not updating the storage correctly.) Such is still true of the=20 > ELFv2 context. >=20 > (Most folks notice this via shutdown timeouts and the fans > going fast unnecessarily. But it is involved earlier as well.) > END QUOTE >=20 > Nothing in the boot sequence is forcing the CPUs/Cores to > see any particular time relationship to each other and on > the multi-socket PowerMacs it can vary widely (G4 and G5). > Sometimes it will happen to end up okay, other times not. >=20 > (I've no access to a single-socket, multi-core PowerMac, > so I just do not know for that kind of context.) >=20 > I run with patched boot-code that has cross-cpu/core time > observations and adjustments to non-bsp time to see the > bsp time as between the start and end of a round trip to > the bsp from each non-bsp to get the bsp's time. It is > based on the mid-point of the start and end times for > the non-bsp's round trip vs. the bsp's returned time. > With at most 4 cores, each non-bsp is done in sequence. > The code only does this on PowerMacs, having no access > to other types of PowerPC examples to test. >=20 > I've not seen this type of problem since mid 2019-May on > any of: >=20 > 1 G5 with 2 sockets, 1 core each > 2 G5's, 2 sockets, 2 cores each > 2 G4's, 2 sockets, 1 core each >=20 > (The mid-May time frame is when I adjusted the code to > deal with the faster time increment on the slower > 32-bit processors for the model that I have access to. > I had to be more careful to avoid biasing the approximate > symmetry to be less symmetric. On the G5's its been > longer since I've seen this problem, based on earlier > source code.) >=20 > Unfortunately the "lab" the machines are in is powered > down currently. >=20 > FYI: Prior to this technique, I had a pure hack that > was observed to avoid the problem. But it changed code > used all the time --code that I did not want to have > any hack's in if I could avoid it. >=20 > FYI: I also run with other PowerMac related patches. > Generally this mftb() time adjustment is one of the > newest patches, possibly the newest. So my test > context may be biased by the other patches. >=20 >> If I boot to OS X 10.5 and load the system, the fans are stable. >=20 > I've not done any investigation of the issue for the > older contexts. But, if I remember right, I did see > the problem on occasion back in that time frame. >=20 >> Does anyone else get this? >=20 > My understanding is everyone booting a fairly modern > standard FreeBSD gets this sometimes for the kind of > context that you specified. (I'm not sure of the > variability if the frequency of the problem happening > for that kind of context.) >=20 > I certainly saw it before I investigated avoiding it. I got access to the sources with the patches. I'll cover both 32-bit and 64-bit for how I have avoided the thread-stuck-sleeping problems, including for pmac_thermal. In the following you likely want to avoid my use of ap_pcpu being volatile (not a pointer to something volatile). I use volatile to document that I do not want the code generation to avoid storing or accessing RAM for what is volatile. -extern void *ap_pcpu; +extern void * volatile ap_pcpu; There is one example of this in the diff's that I show. I also put powerpc_sync(); soon after any ap_pcpu =3D ... assignment. I do not show here such declarations or powerpc_sync() use that are not in the sources tied to the mftb related time measurements and adjustments at boot time. (ap_pcpu is sometimes not alone in being a variable that an added powerpc_sync() is for. An example will occur below.) The additional powerpc_sync() use is for gathering evidence that a sufficient context exists, not trying to be an example of a minimalist sufficient implementation. isync() use and possibly more have a similar status here. First: I'm paranoid about interrupts messing up times, so I use a modified mttb(...): (whitespace may not be well preserved) Index: /usr/src/sys/powerpc/include/cpufunc.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/sys/powerpc/include/cpufunc.h (revision 356426) +++ /usr/src/sys/powerpc/include/cpufunc.h (working copy) @@ -155,15 +155,8 @@ return (tb); } -static __inline void -mttb(u_quad_t time) -{ +// mttb is listed after intr_disable and intr_restore. =20 - mtspr(TBR_TBWL, 0); - mtspr(TBR_TBWU, (uint32_t)(time >> 32)); - mtspr(TBR_TBWL, (uint32_t)(time & 0xffffffff)); -} - static __inline void eieio(void) { @@ -202,6 +195,19 @@ mtmsr(msr); } =20 +static __inline void +mttb(u_quad_t time) +{ + const uint32_t high=3D time>>32; + const uint32_t low=3D time&0xffffffffu; + =20 + const register_t predisable_msr=3D intr_disable(); + mtspr(TBR_TBWL, 0); + mtspr(TBR_TBWU, high); + mtspr(TBR_TBWL, low); + intr_restore(predisable_msr); +} + static __inline struct pcpu * get_pcpu(void) { I have platform_powermac.c enable what I call alternate_timebase_sync_style. I have no place else doing so: PowerMac's are my only PowerPC test environment. Be WARNED this code is one of various places involved in changing ap_pcpu to be volatile (not point to volatile). You likely do not want that --and, if you use it, various other places would need to track. Still, I show my code as it is here. Index: /usr/src/sys/powerpc/powermac/platform_powermac.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/sys/powerpc/powermac/platform_powermac.c (revision = 356426) +++ /usr/src/sys/powerpc/powermac/platform_powermac.c (working copy) @@ -55,7 +55,7 @@ #include "platform_if.h" -extern void *ap_pcpu; +extern void * volatile ap_pcpu; static int powermac_probe(platform_t); static int powermac_attach(platform_t); @@ -333,6 +333,9 @@ return (powermac_smp_fill_cpuref(cpuref, bsp)); } +// platform_powermac.c is implicitly an AIM context: no explicit AIM = test. +extern volatile int alternate_timebase_sync_style; // 0 indicates old = style; 1 indicates new style + static int powermac_smp_start_cpu(platform_t plat, struct pcpu *pc) { @@ -367,6 +370,13 @@ ap_pcpu =3D pc; + // platform_powermac.c is implicitly an AIM context: no explicit = AIM test. + // Part of: Attempt a better-than-historical approximately + // equal timebase value for ap vs. bsp + alternate_timebase_sync_style=3D 1; // So: new style for = PowerMacs + + powerpc_sync(); // for ap_pcpu and alternate_timebase_sync_style + if (rstvec_virtbase =3D=3D NULL) rstvec_virtbase =3D pmap_mapdev(0x80000000, PAGE_SIZE); There is more use of intr_disable() in the code directly targeting measuring and adjusting time: some of the code in mp_machdep.c . The protocol for going back and forth with the bsp here is based on memory flags. machdep_ap_bootstrap and cpu_mp_unleash have comments about the protocol and its use. In setting this up I had to be careful to make code that avoided optimizations from calculating various things earlier than appropriate. There is a comment about that in machdep_ap_bootstrap. Index: /usr/src/sys/powerpc/powerpc/mp_machdep.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/sys/powerpc/powerpc/mp_machdep.c (revision 356426) +++ /usr/src/sys/powerpc/powerpc/mp_machdep.c (working copy) @@ -69,6 +69,88 @@ static struct mtx ap_boot_mtx; struct pcb stoppcbs[MAXCPU]; +#if defined(AIM) +// Part of: Attempt a better-than-historical approximately +// equal timebase value for ap vs. bsp + +volatile int alternate_timebase_sync_style=3D 0; // 0 indicates old = style; 1 indicates new style. +volatile uint64_t bsp_timebase_sample=3D 0u; + +volatile unsigned int from_bsp_status_flag=3D 0u; +// stages: 0u, 1u (bsp ready to start), 2u (bsp tbr value available to = ap) + +volatile unsigned int from_ap_status_flag=3D 0u; +// stages: 0u, 1u (ap ready for bsp tbr value to be found and sent) +#endif + +static __inline uint64_t +mftb_with_no_pointer_use(void) +{ +#ifdef __powerpc64__ + uint64_t tb; // not used for 32-bit powerpc + __asm __volatile ("mftb %0" : "=3Dr"(tb)); + return tb; +#else + uint32_t tbu; // not pointer into tb + uint32_t tbl; // not pointer into tb + + do { + tbu=3D mfspr(TBR_TBU); + tbl=3D mfspr(TBR_TBL); + } while (tbu !=3D mfspr(TBR_TBU)); + + // The construction of the unint64_t value does bias the mttb = some + // for the round-trip-start side of things. + // + // The pointers into tb technique would involve a pair of memory + // writes and a pair of memory reads instead, the writes being + // in the loop. + return ((uint64_t)tbu<<32) | tbl; +#endif +} + +static __inline uint64_t +mftb_plus_delta(volatile uint64_t* bsp_tbr, int64_t ap_midpoint) + // The return value is used in the mttb as the argument. +{ =20 +#ifdef __powerpc64__ + uint64_t tb; // not used for 32-bit powerpc + __asm __volatile ("mftb %0" : "=3Dr"(tb)); + // The construction of the unint64_t value does bias the mttb = some: + // it assignes an earlier time than hoped for, given these later + // calculations. + return tb + ((int64_t)*bsp_tbr - ap_midpoint); +#else + // Establishes delta_for_approx_match_to_bsp_tbr_values such = that: + // = ap_midpoint+delta_for_approx_match_to_bsp_tbr_values=3D=3D*bsp_tbr + int64_t delta_for_approx_match_to_bsp_tbr_values; + uint32_t tbu; // not pointer into tb + uint32_t tbl; // not pointer into tb + + do { + // The below in-loop style is for avoiding the loop + // vs. ap_midpoint's calculation being reversed in + // the code generated: volatile is is being put to + // use here. + delta_for_approx_match_to_bsp_tbr_values=3D = (int64_t)*bsp_tbr-ap_midpoint; + + tbu=3D mfspr(TBR_TBU); + tbl=3D mfspr(TBR_TBL); + } while (tbu !=3D mfspr(TBR_TBU)); + + // The construction of the unint64_t value does bias the mttb = some: + // it assignes an earlier time than hoped for, given these later + // calculations. Easily observable on the example 7455 based = PowerMac + // G4. (Faster than G5 tbr increment rate but a slower = processor,) + // But the overall process is still an improvement. + // + // The pointers into tb technique would involve a pair of memory + // writes and a pair of memory reads instead, the writes being + // in the loop. The "+ . . ." would still be involved. + return ( ((uint64_t)tbu<<32) | tbl ) + = delta_for_approx_match_to_bsp_tbr_values; +#endif +} + void machdep_ap_bootstrap(void) { @@ -76,19 +158,76 @@ PCPU_SET(awake, 1); __asm __volatile("msync; isync"); +#if defined(AIM) + powerpc_sync(); + isync(); + if (1=3D=3Dalternate_timebase_sync_style) + { + // Part of: Attempt a better-than-historical = approximately + // equal timebase value for ap vs. bsp + + // No claim to deal with overflow/wraparound of tbr, or = even + // of the upper bit being on. + + register_t oldmsr=3D intr_disable(); + + while (1u!=3Dfrom_bsp_status_flag) + ; // spin waiting for bsp to flag that its ready = to start. + + // Start to measure a round trip:: to the bsp and back. + + isync(); // Be sure below mftb() result is not from = earlier speculative execution. + uint64_t const start_round_trip_time_on_ap=3D = mftb_with_no_pointer_use(); + atomic_store_rel_int(&from_ap_status_flag, 1u); // bsp = waits for such before its mftb(). + + while (2u!=3Dfrom_bsp_status_flag) + ; // spin waiting for bsp's tbr value + + // Mid-point of ap round trip and the bsp timebase value = should be approximately equal + // when the tbr's are well matched, absent interruptions = on both sides. + + isync(); // Be sure below mftb() result is not from = earlier speculative execution. + uint64_t const end_round_trip_time_on_ap=3D = mftb_with_no_pointer_use(); + isync(); // Be sure above mftb() result is not from = overlapping with the following. + + int64_t const approx_round_trip_tbr_delta_on_ap + =3D (int64_t)end_round_trip_time_on_ap - = (int64_t)start_round_trip_time_on_ap; + int64_t const ap_midpoint_value + =3D (int64_t)start_round_trip_time_on_ap = + approx_round_trip_tbr_delta_on_ap/2; + + // The mftb_plus_delta use is for helping to the control = the code order relative + // to tbr access. Such issues are notable for the 7455 = based 2-socket PowerMacs, + // for example. Faster tbr increment rate than the G5's = but slower processors + // and such. Still, overall this definitely helps such = contexts compared to the + // historical style of timebase synchronization. + isync(); // Be sure below mftb() result is not from = earlier speculative execution. + = mttb(mftb_plus_delta(&bsp_timebase_sample,ap_midpoint_value)); + + atomic_store_rel_int(&from_bsp_status_flag, 0u); // Get = ready for next ap in bsp loop + atomic_store_rel_int(&from_ap_status_flag, 0u); // Flag = bsp that this ap is done + + mtmsr(oldmsr); + } +#endif + while (ap_letgo =3D=3D 0) nop_prio_vlow(); nop_prio_medium(); - /* - * Set timebase as soon as possible to meet an implicit = rendezvous - * from cpu_mp_unleash(), which sets ap_letgo and then = immediately - * sets timebase. - * - * Note that this is instrinsically racy and is only relevant on - * platforms that do not support better mechanisms. - */ - platform_smp_timebase_sync(ap_timebase, 1); +#if defined(AIM) + if (0=3D=3Dalternate_timebase_sync_style) +#endif + { + /* + * Set timebase as soon as possible to meet an implicit = rendezvous + * from cpu_mp_unleash(), which sets ap_letgo and then = immediately + * sets timebase. + * + * Note that this is instrinsically racy and is only = relevant on + * platforms that do not support better mechanisms. + */ + platform_smp_timebase_sync(ap_timebase, 1); + } /* Give platform code a chance to do anything else necessary */ platform_smp_ap_init(); @@ -261,6 +400,34 @@ pc->pc_cpuid, = (uintmax_t)pc->pc_hwref, pc->pc_awake); smp_cpus++; + +#if defined(AIM) + // Part of: Attempt a better-than-historical = approximately + // equal timebase value for ap vs. bsp + powerpc_sync(); + isync(); + if (1=3D=3Dalternate_timebase_sync_style) + { + register_t oldmsr=3D intr_disable(); + + = atomic_store_rel_int(&from_bsp_status_flag, 1u); // bsp ready to start. + + while (1u!=3Dfrom_ap_status_flag) + ; // spin waiting for ap to = flag: time to send a tbr. + + isync(); // Be sure below mftb() result = is not from earlier. + bsp_timebase_sample=3D = mftb_with_no_pointer_use(); + = atomic_store_rel_int(&from_bsp_status_flag, 2u); // bsp tbr available. + + // Most of the rest of the usage is in = machdep_ap_bootstrap, + // other than controling = alternate_timebase_sync_style value. + + while (0u!=3Dfrom_ap_status_flag) + ; // spin waiting for ap to be = done with the sample. + + mtmsr(oldmsr); + } +#endif } else CPU_SET(pc->pc_cpuid, &stopped_cpus); } @@ -267,14 +434,22 @@ ap_awake =3D 1; - /* Provide our current DEC and TB values for APs */ - ap_timebase =3D mftb() + 10; - __asm __volatile("msync; isync"); +#if defined(AIM) + if (0=3D=3Dalternate_timebase_sync_style) +#endif + { + /* Provide our current DEC and TB values for APs */ + ap_timebase =3D mftb() + 10; + __asm __volatile("msync; isync"); + } /* Let APs continue */ atomic_store_rel_int(&ap_letgo, 1); - platform_smp_timebase_sync(ap_timebase, 0); +#if defined(AIM) + if (0=3D=3Dalternate_timebase_sync_style) +#endif + platform_smp_timebase_sync(ap_timebase, 0); while (ap_awake < smp_cpus) ; =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?DC70B788-24D0-4578-8A57-BB8DA753CAD4>