FreeBSD Mail Archives

Date:      Sat, 2 Mar 2019 21:20:58 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Mark Millard via freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   powerpc64 on PowerMac G5 4-core (system total): a hack that so far seem to avoid the stuck-sleeping issue
Message-ID:  <B898BF60-2872-4FFC-AD72-A32591BC7D20@yahoo.com>



[This note goes in a different direction compared to my
prior evidence report for overflows and the later activity
that has been happening for it. This does *not* involve
the patches associated with that report.]

I view the following as an evidence-gathering hack:
showing the change in behavior with the code changes,
not as directly what FreeBSD should do for powerpc64.
In code for defined(__powerpc64__) && defined(AIM)
I freely use knowledge of the PowerMac G5 context
instead of attempting general code.

Also: the code is set up to record some information
that I've been looking at via ddb. The recording is
not part of what changes the behavior but I decided
to show that code too.

It is preliminary, but, so far, the hack has kept
buf*daemon* threads and pmac_thermal from getting stuck
sleeping (or, at least, has made that far less frequent).


The tbr-value hack:

From what I see, the various G5 cores each have a tbr running at the
same rate but with some offsets between them as far as the base time
goes. cpu_mp_unleash does:

        ap_awake = 1;

        /* Provide our current DEC and TB values for APs */
        ap_timebase = mftb() + 10;
        __asm __volatile("msync; isync");

        /* Let APs continue */
        atomic_store_rel_int(&ap_letgo, 1);

        platform_smp_timebase_sync(ap_timebase, 0);

and machdep_ap_bootstrap does:

        /*
         * Set timebase as soon as possible to meet an implicit rendezvous
         * from cpu_mp_unleash(), which sets ap_letgo and then immediately
         * sets timebase.
         *
         * Note that this is instrinsically racy and is only relevant on
         * platforms that do not support better mechanisms.
         */
        platform_smp_timebase_sync(ap_timebase, 1);


which attempts to set the tbrs appropriately.

But at small scales of difference, the tbr values
from different cpus end up not well ordered relative
to time, synchronizes-with relationships, and the like.
Only large enough differences reliably indicate an
ordering of interest.

Note: tc->tc_get_timecount(tc) only provides the
least significant 32 bits of the tbr value.
th->th_offset_count is also 32 bits and based on
truncated tbr values.
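
To illustrate why the truncation matters, here is a minimal sketch,
not FreeBSD code. The helper names low32, masked_delta and
masked_delta_examples are hypothetical, and the all-ones mask is an
assumption standing in for tc->tc_counter_mask. It shows that a new
reading just a few ticks behind the stored offset count wraps into
a huge unsigned difference instead of a small negative one:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: keep only the least significant 32 bits of a
 * 64-bit timebase value, as the note says the timecounter does.
 */
static uint32_t
low32(uint64_t tbr)
{
        return (uint32_t)tbr;
}

/*
 * Hypothetical helper: the same masked subtraction binuptime uses for
 * (new count - stored offset count).
 */
static uint32_t
masked_delta(uint32_t tim_cnt, uint32_t tim_offset, uint32_t mask)
{
        return (tim_cnt - tim_offset) & mask;
}

/* Worked cases; the mask 0xffffffff is an assumed 32-bit counter mask. */
void
masked_delta_examples(void)
{
        /* The upper 32 bits are simply dropped. */
        assert(low32(0x0000000123456789ull) == 0x23456789u);
        /* Five ticks forward: a small, sensible difference. */
        assert(masked_delta(0x1005u, 0x1000u, 0xffffffffu) == 5u);
        /* Three ticks backward: wraps to nearly 2^32 ticks forward. */
        assert(masked_delta(0x0ffdu, 0x1000u, 0xffffffffu) == 0xfffffffdu);
}
```

The last case is the kind of small backwards step the hack below
is written to avoid.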

So I made binuptime avoid finishing when it sees
a small (<0x10) step backwards for a new
tc->tc_get_timecount(tc) value vs. the existing
th->th_offset_count value (values strongly tied
to powerpc64 tbr values):

void
binuptime(struct bintime *bt)
{
        struct timehands *th;
        u_int gen;

        struct bintime old_bt= *bt; // HACK!!!
        struct timecounter *tc; // HACK!!!
        u_int tim_cnt, tim_offset, tim_diff; // HACK!!!
        uint64_t freq, scale_factor, diff_scaled; // HACK!!!

        u_int try_cnt= 0u; // HACK!!!

        do {
                do { // HACK!!!
                    th = timehands;
                    tc = th->th_counter;
                    gen = atomic_load_acq_int(&th->th_generation);
                    tim_cnt= tc->tc_get_timecount(tc);
                    tim_offset= th->th_offset_count;
                } while (tim_cnt<tim_offset && tim_offset-tim_cnt<0x10);
                *bt = th->th_offset;
                tim_diff= (tim_cnt - tim_offset) & tc->tc_counter_mask;
                scale_factor= th->th_scale;
                diff_scaled= scale_factor * tim_diff;
                bintime_addx(bt, diff_scaled);
                freq= tc->tc_frequency;
                atomic_thread_fence_acq();
                try_cnt++;
        } while (gen == 0 || gen != th->th_generation);

        if (*(volatile uint64_t*)0xc000000000000020==0u && (0xffffffffffffffffull/scale_factor)<tim_diff) { // HACK!!!
                *(volatile uint64_t*)0xc000000000000020= bttosbt(old_bt);
                *(volatile uint64_t*)0xc000000000000028= bttosbt(*bt);
                *(volatile uint64_t*)0xc000000000000030= freq;
                *(volatile uint64_t*)0xc000000000000038= scale_factor;
                *(volatile uint64_t*)0xc000000000000040= tim_offset;
                *(volatile uint64_t*)0xc000000000000048= tim_cnt;
                *(volatile uint64_t*)0xc000000000000050= tim_diff;
                *(volatile uint64_t*)0xc000000000000058= try_cnt;
                *(volatile uint64_t*)0xc000000000000060= diff_scaled;
                *(volatile uint64_t*)0xc000000000000068= scale_factor*freq;
                __asm__ ("sync");
        } else if (*(volatile uint64_t*)0xc0000000000000a0==0u && (0xffffffffffffffffull/scale_factor)<tim_diff) { // HACK!!!
                *(volatile uint64_t*)0xc0000000000000a0= bttosbt(old_bt);
                *(volatile uint64_t*)0xc0000000000000a8= bttosbt(*bt);
                *(volatile uint64_t*)0xc0000000000000b0= freq;
                *(volatile uint64_t*)0xc0000000000000b8= scale_factor;
                *(volatile uint64_t*)0xc0000000000000c0= tim_offset;
                *(volatile uint64_t*)0xc0000000000000c8= tim_cnt;
                *(volatile uint64_t*)0xc0000000000000d0= tim_diff;
                *(volatile uint64_t*)0xc0000000000000d8= try_cnt;
                *(volatile uint64_t*)0xc0000000000000e0= diff_scaled;
                *(volatile uint64_t*)0xc0000000000000e8= scale_factor*freq;
                __asm__ ("sync");
        }
}
#else
. . .
#endif
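
For reference, the two conditions used in the hack above can be
pulled out into standalone predicates. This is only a sketch for
reading the code, not part of what was tested: the function names
are hypothetical, and the guard against a zero scale factor is an
addition that the hack itself does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Same test as the inner loop's while clause: the new count is
 * behind the stored offset count, but by fewer than 0x10 ticks.
 */
static bool
small_step_back(uint32_t tim_cnt, uint32_t tim_offset)
{
        return tim_cnt < tim_offset && tim_offset - tim_cnt < 0x10;
}

/*
 * Same test as the recording condition: scale_factor * tim_diff
 * would not fit in 64 bits. The zero check is an added assumption
 * to keep this sketch free of division by zero.
 */
static bool
scaled_diff_overflows(uint64_t scale_factor, uint32_t tim_diff)
{
        return scale_factor != 0 &&
            (UINT64_MAX / scale_factor) < tim_diff;
}

/* Worked cases; the scale value 2^33 is arbitrary, for illustration. */
void
predicate_examples(void)
{
        assert(small_step_back(0x0ffdu, 0x1000u));   /* 3 ticks back */
        assert(!small_step_back(0x0ff0u, 0x1000u));  /* 0x10 back: not small */
        assert(!small_step_back(0x1001u, 0x1000u));  /* forward step */
        /* A wrapped small step back overflows once scaled. */
        assert(scaled_diff_overflows(1ull << 33, 0xfffffffdu));
        assert(!scaled_diff_overflows(1ull << 33, 5u));
}
```

With these, the inner loop retries while small_step_back() holds,
and the ddb recording fires only when scaled_diff_overflows() holds
for the values binuptime ended up using.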

So far as I can tell, the FreeBSD code is not designed to deal
with small differences in tc->tc_get_timecount(tc) not actually
indicating a useful < vs. == vs. > ordering relation uniquely.

(I make no claim that the hack is a proper way to deal with
such.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
