Date: Sun, 19 Jan 2020 14:22:21 -0800 From: Mark Millard <marklmi@yahoo.com> To: Francis Little <oggy@farscape.co.uk> Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org> Subject: Re: G5 Quad Fans full speed after 1 min Message-ID: <FCED3FE8-0839-4A3D-B730-FE4A10B396A9@yahoo.com> In-Reply-To: <DC70B788-24D0-4578-8A57-BB8DA753CAD4@yahoo.com> References: <CAGSRtz76TC44UiECMepLAZjXE=H33sg4OFLJpmZpNCSEwE=ETQ@mail.gmail.com> <21533290-667C-472E-89F7-D1E7DE773193@yahoo.com> <DC70B788-24D0-4578-8A57-BB8DA753CAD4@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
[I forgot to include something for the patches involved. Added this time.] On 2020-Jan-19, at 13:00, Mark Millard <marklmi at yahoo.com> wrote: > On 2020-Jan-19, at 03:13, Mark Millard <marklmi at yahoo.com> wrote: >>=20 >> On 2020-Jan-19, at 00:38, Francis Little <oggy at farscape.co.uk> = wrote: >>=20 >>> Hi, >>>=20 >>> My G5 Quad is running current from a few days ago, but this issue = has been >>> happening for a long time. >>>=20 >>> After about 1 min of uptime, the fans go full speed. >>>=20 >>> As soon as I query anything like CPU temp or fan rpm with sysctl, = the fans >>> return to a normal speed. >>>=20 >>> 1 min later the fans go full speed again. >>>=20 >>> I've been working round this for some time with a cron job that runs = sysctl >>> with one of the cpu temp sensors to calm the system. >>=20 >> QUOTING an old message: >> The mftb figures on the various cores can be so far apart that >> threads can end-up stuck sleeping, such as syncr, pmac_thermal, >> and buf*deamon* threads. (This can really mess things up by >> not updating the storage correctly.) Such is still true of the=20 >> ELFv2 context. >>=20 >> (Most folks notice this via shutdown timeouts and the fans >> going fast unnecessarily. But it is involved earlier as well.) >> END QUOTE >>=20 >> Nothing in the boot sequence is forcing the CPUs/Cores to >> see any particular time relationship to each other and on >> the multi-socket PowerMacs it can vary widely (G4 and G5). >> Sometimes it will happen to end up okay, other times not. >>=20 >> (I've no access to a single-socket, multi-core PowerMac, >> so I just do not know for that kind of context.) >>=20 >> I run with patched boot-code that has cross-cpu/core time >> observations and adjustments to non-bsp time to see the >> bsp time as between the start and end of a round trip to >> the bsp from each non-bsp to get the bsp's time. It is >> based on the mid-point of the start and end times for >> the non-bsp's round trip vs. the bsp's returned time. >> With at most 4 cores, each non-bsp is done in sequence. >> The code only does this on PowerMacs, having no access >> to other types of PowerPC examples to test. >>=20 >> I've not seen this type of problem since mid 2019-May on >> any of: >>=20 >> 1 G5 with 2 sockets, 1 core each >> 2 G5's, 2 sockets, 2 cores each >> 2 G4's, 2 sockets, 1 core each >>=20 >> (The mid-May time frame is when I adjusted the code to >> deal with the faster time increment on the slower >> 32-bit processors for the model that I have access to. >> I had to be more careful to avoid biasing the approximate >> symmetry to be less symmetric. On the G5's its been >> longer since I've seen this problem, based on earlier >> source code.) >>=20 >> Unfortunately the "lab" the machines are in is powered >> down currently. >>=20 >> FYI: Prior to this technique, I had a pure hack that >> was observed to avoid the problem. But it changed code >> used all the time --code that I did not want to have >> any hack's in if I could avoid it. >>=20 >> FYI: I also run with other PowerMac related patches. >> Generally this mftb() time adjustment is one of the >> newest patches, possibly the newest. So my test >> context may be biased by the other patches. >>=20 >>> If I boot to OS X 10.5 and load the system, the fans are stable. >>=20 >> I've not done any investigation of the issue for the >> older contexts. But, if I remember right, I did see >> the problem on occasion back in that time frame. >>=20 >>> Does anyone else get this? >>=20 >> My understanding is everyone booting a fairly modern >> standard FreeBSD gets this sometimes for the kind of >> context that you specified. (I'm not sure of the >> variability if the frequency of the problem happening >> for that kind of context.) >>=20 >> I certainly saw it before I investigated avoiding it. >=20 > I got access to the sources with the patches. > I'll cover both 32-bit and 64-bit for how I > have avoided the thread-stuck-sleeping problems, > including for pmac_thermal. >=20 >=20 > In the following you likely want to avoid my > use of ap_pcpu being volatile (not a pointer > to something volatile). I use volatile to > document that I do not want the code > generation to avoid storing or accessing > RAM for what is volatile. >=20 > -extern void *ap_pcpu; > +extern void * volatile ap_pcpu; >=20 > There is one example of this in the diff's > that I show. >=20 > I also put powerpc_sync(); soon after any > ap_pcpu =3D ... assignment. I do not show > here such declarations or powerpc_sync() > use that are not in the sources tied to > the mftb related time measurements and > adjustments at boot time. (ap_pcpu is > sometimes not alone in being a variable > that an added powerpc_sync() is for. An > example will occur below.) >=20 > The additional powerpc_sync() use is for > gathering evidence that a sufficient context > exists, not trying to be an example of a > minimalist sufficient implementation. isync() > use and possibly more have a similar status > here. >=20 >=20 > First: I'm paranoid about interrupts messing > up times, so I use a modified mttb(...): > (whitespace may not be well preserved) >=20 > Index: /usr/src/sys/powerpc/include/cpufunc.h > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/include/cpufunc.h (revision 356426) > +++ /usr/src/sys/powerpc/include/cpufunc.h (working copy) > @@ -155,15 +155,8 @@ > return (tb); > } >=20 > -static __inline void > -mttb(u_quad_t time) > -{ > +// mttb is listed after intr_disable and intr_restore. >=20 > - mtspr(TBR_TBWL, 0); > - mtspr(TBR_TBWU, (uint32_t)(time >> 32)); > - mtspr(TBR_TBWL, (uint32_t)(time & 0xffffffff)); > -} > - > static __inline void > eieio(void) > { > @@ -202,6 +195,19 @@ > mtmsr(msr); > } >=20 > +static __inline void > +mttb(u_quad_t time) > +{ > + const uint32_t high=3D time>>32; > + const uint32_t low=3D time&0xffffffffu; > + =20 > + const register_t predisable_msr=3D intr_disable(); > + mtspr(TBR_TBWL, 0); > + mtspr(TBR_TBWU, high); > + mtspr(TBR_TBWL, low); > + intr_restore(predisable_msr); > +} > + > static __inline struct pcpu * > get_pcpu(void) > { >=20 >=20 > I have platform_powermac.c enable what I call > alternate_timebase_sync_style. I have no place > else doing so: PowerMac's are my only PowerPC > test environment. >=20 > Be WARNED this code is one of various places > involved in changing ap_pcpu to be volatile > (not point to volatile). You likely do not want > that --and, if you use it, various other places > would need to track. Still, I show my code as > it is here. >=20 > Index: /usr/src/sys/powerpc/powermac/platform_powermac.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/powermac/platform_powermac.c (revision = 356426) > +++ /usr/src/sys/powerpc/powermac/platform_powermac.c (working copy) > @@ -55,7 +55,7 @@ >=20 > #include "platform_if.h" >=20 > -extern void *ap_pcpu; > +extern void * volatile ap_pcpu; >=20 > static int powermac_probe(platform_t); > static int powermac_attach(platform_t); > @@ -333,6 +333,9 @@ > return (powermac_smp_fill_cpuref(cpuref, bsp)); > } >=20 > +// platform_powermac.c is implicitly an AIM context: no explicit AIM = test. > +extern volatile int alternate_timebase_sync_style; // 0 indicates old = style; 1 indicates new style > + > static int > powermac_smp_start_cpu(platform_t plat, struct pcpu *pc) > { > @@ -367,6 +370,13 @@ >=20 > ap_pcpu =3D pc; >=20 > + // platform_powermac.c is implicitly an AIM context: no = explicit AIM test. > + // Part of: Attempt a better-than-historical approximately > + // equal timebase value for ap vs. bsp > + alternate_timebase_sync_style=3D 1; // So: new style for = PowerMacs > + > + powerpc_sync(); // for ap_pcpu and = alternate_timebase_sync_style > + > if (rstvec_virtbase =3D=3D NULL) > rstvec_virtbase =3D pmap_mapdev(0x80000000, PAGE_SIZE); >=20 >=20 >=20 > There is more use of intr_disable() in the code directly > targeting measuring and adjusting time: some of the > code in mp_machdep.c . The protocol for going back and > forth with the bsp here is based on memory flags. > machdep_ap_bootstrap and cpu_mp_unleash have comments > about the protocol and its use. >=20 > In setting this up I had to be careful to make code that > avoided optimizations from calculating various things > earlier than appropriate. There is a comment about that > in machdep_ap_bootstrap. >=20 > Index: /usr/src/sys/powerpc/powerpc/mp_machdep.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/powerpc/mp_machdep.c (revision 356426) > +++ /usr/src/sys/powerpc/powerpc/mp_machdep.c (working copy) > @@ -69,6 +69,88 @@ > static struct mtx ap_boot_mtx; > struct pcb stoppcbs[MAXCPU]; >=20 > +#if defined(AIM) > +// Part of: Attempt a better-than-historical approximately > +// equal timebase value for ap vs. bsp > + > +volatile int alternate_timebase_sync_style=3D 0; // 0 indicates old = style; 1 indicates new style. > +volatile uint64_t bsp_timebase_sample=3D 0u; > + > +volatile unsigned int from_bsp_status_flag=3D 0u; > +// stages: 0u, 1u (bsp ready to start), 2u (bsp tbr value available = to ap) > + > +volatile unsigned int from_ap_status_flag=3D 0u; > +// stages: 0u, 1u (ap ready for bsp tbr value to be found and sent) > +#endif > + > +static __inline uint64_t > +mftb_with_no_pointer_use(void) > +{ > +#ifdef __powerpc64__ > + uint64_t tb; // not used for 32-bit powerpc > + __asm __volatile ("mftb %0" : "=3Dr"(tb)); > + return tb; > +#else > + uint32_t tbu; // not pointer into tb > + uint32_t tbl; // not pointer into tb > + > + do { > + tbu=3D mfspr(TBR_TBU); > + tbl=3D mfspr(TBR_TBL); > + } while (tbu !=3D mfspr(TBR_TBU)); > + > + // The construction of the unint64_t value does bias the mttb = some > + // for the round-trip-start side of things. > + // > + // The pointers into tb technique would involve a pair of = memory > + // writes and a pair of memory reads instead, the writes being > + // in the loop. > + return ((uint64_t)tbu<<32) | tbl; > +#endif > +} > + > +static __inline uint64_t > +mftb_plus_delta(volatile uint64_t* bsp_tbr, int64_t ap_midpoint) > + // The return value is used in the mttb as the argument. > +{ =20 > +#ifdef __powerpc64__ > + uint64_t tb; // not used for 32-bit powerpc > + __asm __volatile ("mftb %0" : "=3Dr"(tb)); > + // The construction of the unint64_t value does bias the mttb = some: > + // it assignes an earlier time than hoped for, given these = later > + // calculations. > + return tb + ((int64_t)*bsp_tbr - ap_midpoint); > +#else > + // Establishes delta_for_approx_match_to_bsp_tbr_values such = that: > + // = ap_midpoint+delta_for_approx_match_to_bsp_tbr_values=3D=3D*bsp_tbr > + int64_t delta_for_approx_match_to_bsp_tbr_values; > + uint32_t tbu; // not pointer into tb > + uint32_t tbl; // not pointer into tb > + > + do { > + // The below in-loop style is for avoiding the loop > + // vs. ap_midpoint's calculation being reversed in > + // the code generated: volatile is is being put to > + // use here. > + delta_for_approx_match_to_bsp_tbr_values=3D = (int64_t)*bsp_tbr-ap_midpoint; > + > + tbu=3D mfspr(TBR_TBU); > + tbl=3D mfspr(TBR_TBL); > + } while (tbu !=3D mfspr(TBR_TBU)); > + > + // The construction of the unint64_t value does bias the mttb = some: > + // it assignes an earlier time than hoped for, given these = later > + // calculations. Easily observable on the example 7455 based = PowerMac > + // G4. (Faster than G5 tbr increment rate but a slower = processor,) > + // But the overall process is still an improvement. > + // > + // The pointers into tb technique would involve a pair of = memory > + // writes and a pair of memory reads instead, the writes being > + // in the loop. The "+ . . ." would still be involved. > + return ( ((uint64_t)tbu<<32) | tbl ) + = delta_for_approx_match_to_bsp_tbr_values; > +#endif > +} > + > void > machdep_ap_bootstrap(void) > { > @@ -76,19 +158,76 @@ > PCPU_SET(awake, 1); > __asm __volatile("msync; isync"); >=20 > +#if defined(AIM) > + powerpc_sync(); > + isync(); > + if (1=3D=3Dalternate_timebase_sync_style) > + { > + // Part of: Attempt a better-than-historical = approximately > + // equal timebase value for ap vs. bsp > + > + // No claim to deal with overflow/wraparound of tbr, = or even > + // of the upper bit being on. > + > + register_t oldmsr=3D intr_disable(); > + > + while (1u!=3Dfrom_bsp_status_flag) > + ; // spin waiting for bsp to flag that its = ready to start. > + > + // Start to measure a round trip:: to the bsp and = back. > + > + isync(); // Be sure below mftb() result is not from = earlier speculative execution. > + uint64_t const start_round_trip_time_on_ap=3D = mftb_with_no_pointer_use(); > + atomic_store_rel_int(&from_ap_status_flag, 1u); // bsp = waits for such before its mftb(). > + > + while (2u!=3Dfrom_bsp_status_flag) > + ; // spin waiting for bsp's tbr value > + > + // Mid-point of ap round trip and the bsp timebase = value should be approximately equal > + // when the tbr's are well matched, absent = interruptions on both sides. > + > + isync(); // Be sure below mftb() result is not from = earlier speculative execution. > + uint64_t const end_round_trip_time_on_ap=3D = mftb_with_no_pointer_use(); > + isync(); // Be sure above mftb() result is not from = overlapping with the following. > + > + int64_t const approx_round_trip_tbr_delta_on_ap > + =3D (int64_t)end_round_trip_time_on_ap = - (int64_t)start_round_trip_time_on_ap; > + int64_t const ap_midpoint_value > + =3D = (int64_t)start_round_trip_time_on_ap + = approx_round_trip_tbr_delta_on_ap/2; > + > + // The mftb_plus_delta use is for helping to the = control the code order relative > + // to tbr access. Such issues are notable for the 7455 = based 2-socket PowerMacs, > + // for example. Faster tbr increment rate than the = G5's but slower processors > + // and such. Still, overall this definitely helps such = contexts compared to the > + // historical style of timebase synchronization. > + isync(); // Be sure below mftb() result is not from = earlier speculative execution. > + = mttb(mftb_plus_delta(&bsp_timebase_sample,ap_midpoint_value)); > + > + atomic_store_rel_int(&from_bsp_status_flag, 0u); // = Get ready for next ap in bsp loop > + atomic_store_rel_int(&from_ap_status_flag, 0u); // = Flag bsp that this ap is done > + > + mtmsr(oldmsr); > + } > +#endif > + > while (ap_letgo =3D=3D 0) > nop_prio_vlow(); > nop_prio_medium(); >=20 > - /* > - * Set timebase as soon as possible to meet an implicit = rendezvous > - * from cpu_mp_unleash(), which sets ap_letgo and then = immediately > - * sets timebase. > - * > - * Note that this is instrinsically racy and is only relevant = on > - * platforms that do not support better mechanisms. > - */ > - platform_smp_timebase_sync(ap_timebase, 1); > +#if defined(AIM) > + if (0=3D=3Dalternate_timebase_sync_style) > +#endif > + { > + /* > + * Set timebase as soon as possible to meet an = implicit rendezvous > + * from cpu_mp_unleash(), which sets ap_letgo and then = immediately > + * sets timebase. > + * > + * Note that this is instrinsically racy and is only = relevant on > + * platforms that do not support better mechanisms. > + */ > + platform_smp_timebase_sync(ap_timebase, 1); > + } >=20 > /* Give platform code a chance to do anything else necessary */ > platform_smp_ap_init(); > @@ -261,6 +400,34 @@ > pc->pc_cpuid, = (uintmax_t)pc->pc_hwref, > pc->pc_awake); > smp_cpus++; > + > +#if defined(AIM) > + // Part of: Attempt a better-than-historical = approximately > + // equal timebase value for ap vs. = bsp > + powerpc_sync(); > + isync(); > + if (1=3D=3Dalternate_timebase_sync_style) > + { > + register_t oldmsr=3D intr_disable(); > + > + = atomic_store_rel_int(&from_bsp_status_flag, 1u); // bsp ready to start. > + > + while (1u!=3Dfrom_ap_status_flag) > + ; // spin waiting for ap to = flag: time to send a tbr. > + > + isync(); // Be sure below mftb() = result is not from earlier. > + bsp_timebase_sample=3D = mftb_with_no_pointer_use(); > + = atomic_store_rel_int(&from_bsp_status_flag, 2u); // bsp tbr available. > + > + // Most of the rest of the usage is in = machdep_ap_bootstrap, > + // other than controling = alternate_timebase_sync_style value. > + > + while (0u!=3Dfrom_ap_status_flag) > + ; // spin waiting for ap to be = done with the sample. > + > + mtmsr(oldmsr); > + } > +#endif > } else > CPU_SET(pc->pc_cpuid, &stopped_cpus); > } > @@ -267,14 +434,22 @@ >=20 > ap_awake =3D 1; >=20 > - /* Provide our current DEC and TB values for APs */ > - ap_timebase =3D mftb() + 10; > - __asm __volatile("msync; isync"); > +#if defined(AIM) > + if (0=3D=3Dalternate_timebase_sync_style) > +#endif > + { > + /* Provide our current DEC and TB values for APs */ > + ap_timebase =3D mftb() + 10; > + __asm __volatile("msync; isync"); > + } >=20 > /* Let APs continue */ > atomic_store_rel_int(&ap_letgo, 1); >=20 > - platform_smp_timebase_sync(ap_timebase, 0); > +#if defined(AIM) > + if (0=3D=3Dalternate_timebase_sync_style) > +#endif > + platform_smp_timebase_sync(ap_timebase, 0); >=20 > while (ap_awake < smp_cpus) > ; >=20 I forgot to include a patch that was tied to some PowerPC reference material's notes about the TB register for at least some PowerPC family members: "The TB is a volatile resource and must be initialized during reset." Note: Here the ap_cpu was simply deleted because it was not used anywhere in the file. Index: /usr/src/sys/powerpc/powerpc/machdep.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/sys/powerpc/powerpc/machdep.c (revision 356426) +++ /usr/src/sys/powerpc/powerpc/machdep.c (working copy) @@ -142,8 +142,6 @@ extern vm_paddr_t kernload; #endif =20 -extern void *ap_pcpu; - struct pcpu __pcpu[MAXCPU]; static char init_kenv[2048]; @@ -283,9 +281,12 @@ cpu_feature_setup(); #ifdef AIM + // May restart the kernel at __start and get back here again. aim_early_init(fdt, toc, ofentry, mdp, mdp_cookie); #endif + mttb(0); // "The TB is a volatile resource and must be = initialized during reset." + /* * Parse metadata if present and fetch parameters. Must be done * before console is inited so cninit gets the right value of =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?FCED3FE8-0839-4A3D-B730-FE4A10B396A9>