From owner-freebsd-ppc@freebsd.org  Fri May  3 21:52:12 2019
From: Mark Millard
To: Justin Hibbits, FreeBSD PowerPC ML
Subject: Re: 970MP PowerMac G5s: What printf's show about cpu_mp_unleash hangups on the test variant of head -r347003 (found cpu_mp_unleash counterexample)
Date: Fri, 3 May 2019 14:52:02 -0700
References: <4D659851-8731-4116-A6B6-33A75E9F0B76@yahoo.com>
In-Reply-To: <4D659851-8731-4116-A6B6-33A75E9F0B76@yahoo.com>
Message-Id: <9B05FB92-D1E5-4DD5-BB04-53088E528F15@yahoo.com>

[I've tried a different moea64_bootstrap_slb_prefault usage variation.
Also, I reverted all the Rx,Rx,Rx no-op changes. I put in the change to
protect the mttb(...) implementation from interrupts messing up the
assignment to the TBR, just so that would not be a worry. I still have
my isync additions and my libc string compare code changes in what I'm
working with for head -r347003 .]

On 2019-May-3, at 11:09, Mark Millard wrote:

> [Experiment with disabling the code in each of the nop_prio_ routines
> instead of commenting out specific calls. But mostly: more test runs.
> It does not support what I thought yesterday's cpu_mp_unleash
> behavior suggested.]
> 
> On 2019-May-2, at 22:23, Mark Millard wrote:
> 
>> [Note: I still have your requested loop change, my isync additions,
>> and my libc string compare code change in what I'm working with for
>> head -r347003 .]
>> 
>> I started using printf to help identify more about what code managed
>> to execute vs. what code did not during hang-ups.
>> 
>> This note is just about cpu_mp_unleash observations and experiments
>> related to what the printf's showed.
>> 
>> I did:
>> 
>> static void
>> cpu_mp_unleash(void *dummy)
>> {
>> . . . (omitted as all earlier printf's printed) . . .
>> printf("cpu_mp_unleash: before DELAY\n");
>> 	/* Let the APs get into the scheduler */
>> 	DELAY(10000);
>> printf("cpu_mp_unleash: after DELAY\n");
>> 
>> }
>> 
>> What I saw was that only the first of the two DELAY printf's shown
>> above printed when cpu_mp_unleash hung up, such a hangup being the
>> normal case when vt_upgrade did not hang up first.
>> 
>> So I looked at /mnt/usr/src/sys/powerpc/powerpc/clock.c and its
>> DELAY routine and came up with only one thing that looked like a
>> useful experiment. Note what I then commented out:
>> 
>> # svnlite diff /mnt/usr/src/sys/powerpc/powerpc/clock.c
>> Index: /mnt/usr/src/sys/powerpc/powerpc/clock.c
>> ===================================================================
>> --- /mnt/usr/src/sys/powerpc/powerpc/clock.c	(revision 347003)
>> +++ /mnt/usr/src/sys/powerpc/powerpc/clock.c	(working copy)
>> @@ -309,10 +309,10 @@
>>  	TSENTER();
>>  	tb = mftb();
>>  	ttb = tb + howmany((uint64_t)n * 1000000, ps_per_tick);
>> -	nop_prio_vlow();
>> +	//nop_prio_vlow();
>>  	while (tb < ttb)
>>  		tb = mftb();
>> -	nop_prio_medium();
>> +	//nop_prio_medium();
>>  	TSEXIT();
>>  }
>> 
>> After this change I've not (yet?) seen another cpu_mp_unleash hangup
>> in my test context.
>> 
>> Even if not documented to do so, it appears to me that the
>> "or Rx,Rx,Rx" code behind nop_prio_vlow() does something specific on
>> the 970MP's in the 2-socket/2-core-each G5 PowerMac11,2's -- and
>> what it does interferes with making progress in DELAY, at least in
>> that specific use and/or any others on the APs during
>> cpu_mp_unleash.
>> 
>> Of course, this testing process is probabilistic and I do not have
>> hundreds or more examples of any specific condition at this point.
>> But, so far, the change in behavior seems clear: I went from
>> always-hanging-up-so-far to always-booting-so-far (when vt_upgrade
>> did not prevent the test in each context).
> 
> So I uncommented the 2 calls and instead commented out the contents
> of the nop_prio_ routines, summarized here via:
> 
> # egrep -r 'or.*(31,31,31|1,1,1|6,6,6|2,2,2|5,5,5|3,3,3)' /mnt/usr/src/sys/powerpc/ | more
> /mnt/usr/src/sys/powerpc/include/cpufunc.h:	//__asm __volatile("or 31,31,31");
> /mnt/usr/src/sys/powerpc/include/cpufunc.h:	//__asm __volatile("or 1,1,1");
> /mnt/usr/src/sys/powerpc/include/cpufunc.h:	//__asm __volatile("or 6,6,6");
> /mnt/usr/src/sys/powerpc/include/cpufunc.h:	//__asm __volatile("or 2,2,2");
> /mnt/usr/src/sys/powerpc/include/cpufunc.h:	//__asm __volatile("or 5,5,5");
> /mnt/usr/src/sys/powerpc/include/cpufunc.h:	//__asm __volatile("or 3,3,3");

[I had forgotten to list cpu.h 's use of the 27,27,27 form in
cpu_spinwait.]

I've reverted all such "or Rx,Rx,Rx" changes as having no overall
change in observed behavior for any of the 3 hang-up areas.
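
For anyone reading along without the tree handy: the nop_prio_*
routines that DELAY uses are presumably just inline wrappers around
those priority-hint forms. A minimal sketch (my paraphrase, not a
verbatim copy of cpufunc.h; which "or Rx,Rx,Rx" encoding backs which
name is an assumption based on the grep output above):

static __inline void
nop_prio_vlow(void)
{

	__asm __volatile("or 31,31,31");	/* very-low-priority hint */
}

static __inline void
nop_prio_medium(void)
{

	__asm __volatile("or 2,2,2");		/* medium/normal-priority hint */
}

On processors that honor these encodings they act as SMT
thread-priority hints, so DELAY drops priority before spinning on
mftb() and restores it afterward; they are supposed to behave as plain
no-ops otherwise.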

The new experiment . . .

I've tried changing the loop to only moea64_bootstrap_slb_prefault
about 1/2 of the kernel slb slots, leaving the others to be filled
before mftb()%n_slbs based "random" replacements would start:

 	/*
-	 * Map the entire KVA range into the SLB. We must not fault there.
+	 * Map about 1/2 the kernel slb slots, leaving the others available without
+	 * mftb()%n_slbs replacement being involved.
 	 */
 #ifdef __powerpc64__
-	for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+	i = 0;
+	for (va = virtual_avail; va < virtual_end && i<(n_slbs-1)/2; va += SEGMENT_LENGTH, i++)
 		moea64_bootstrap_slb_prefault(va, 0);
 #endif

This change still gets hang-ups in the same 3 places but, so far, has
booted more often. (Not that I have a large sampling at this point.)
Definitely still probabilistic.
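
To spell out what I mean by the mftb()%n_slbs based "random"
replacements: once all of the kernel SLB slots are valid, installing a
new kernel segment mapping has to evict some existing slot, and the
victim is picked from the low bits of the timebase. A rough sketch of
just that idea (hypothetical helper name, not the actual
sys/powerpc/aim/slb.c code):

static int
pick_kernel_slb_victim(void)
{

	/*
	 * The timebase's low bits serve as a cheap pseudo-random
	 * source; whichever slot they name gets overwritten by the
	 * new kernel segment mapping.
	 */
	return (mftb() % n_slbs);
}

The point of prefaulting only about half the slots is that the
remaining slots can still be filled on demand before that eviction
path has to run during the early AP startup activity.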

Other status . . .

I've still not figured out a way to discover what happens at each of
the 3 hang-up points in the areas identified. I'm also running out of
ideas for how to gain other types of evidence.

One hypothesis about why I've not seen these 3 hang-up places in my
normal environment (currently based on head -r345758) is that I
normally run non-debug kernels in that context. Here it has all been
debug kernels, to better match what artifacts.ci provides, among other
reasons.

For reference, the current -r347003 patches that I was last exploring,
including the extra printf's for the 3 hang-up areas:

# svnlite diff /mnt/usr/src/sys/ | more
Index: /mnt/usr/src/sys/dev/vt/vt_core.c
===================================================================
--- /mnt/usr/src/sys/dev/vt/vt_core.c	(revision 347003)
+++ /mnt/usr/src/sys/dev/vt/vt_core.c	(working copy)
@@ -2720,6 +2720,7 @@
 
 	/* Init 25 Hz timer. */
 	callout_init_mtx(&vd->vd_timer, &vd->vd_lock, 0);
+printf("vt_upgrade after callout_init_mtx / before sequence leading to callout_reset\n");
 
 	/*
 	 * Start timer when everything ready.
@@ -2732,6 +2733,7 @@
 		atomic_add_acq_int(&vd->vd_timer_armed, 1);
 		vd->vd_flags |= VDF_ASYNC;
 		callout_reset(&vd->vd_timer, hz / VT_TIMERFREQ, vt_timer, vd);
+printf("vt_upgrade after callout_reset\n");
 		register_handlers = 1;
 	}
 
Index: /mnt/usr/src/sys/fs/nfsserver/nfs_fha_new.c
===================================================================
--- /mnt/usr/src/sys/fs/nfsserver/nfs_fha_new.c	(revision 347003)
+++ /mnt/usr/src/sys/fs/nfsserver/nfs_fha_new.c	(working copy)
@@ -93,9 +93,11 @@
 	 * Initialize the sysctl context list for the fha module.
 	 */
 	sysctl_ctx_init(&softc->sysctl_ctx);
+printf("fhanew_init: after sysctl_ctx_init / before SYSCTL_ADD_NODE\n");
 	softc->sysctl_tree = SYSCTL_ADD_NODE(&softc->sysctl_ctx,
 	    SYSCTL_STATIC_CHILDREN(_vfs_nfsd), OID_AUTO, "fha", CTLFLAG_RD,
 	    0, "NFS File Handle Affinity (FHA)");
+printf("fhanew_init: after SYSCTL_ADD_NODE\n");
 	if (softc->sysctl_tree == NULL) {
 		printf("%s: unable to allocate sysctl tree\n", __func__);
 		return;
Index: /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c	(working copy)
@@ -956,10 +956,12 @@
 	virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
 
 	/*
-	 * Map the entire KVA range into the SLB. We must not fault there.
+	 * Map about 1/2 the kernel slb slots, leaving the others available without
+	 * mftb()%n_slbs replacement being involved.
 	 */
 #ifdef __powerpc64__
-	for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+	i = 0;
+	for (va = virtual_avail; va < virtual_end && i<(n_slbs-1)/2; va += SEGMENT_LENGTH, i++)
 		moea64_bootstrap_slb_prefault(va, 0);
 #endif
 
@@ -1090,8 +1092,8 @@
 	CPU_SET(PCPU_GET(cpuid), &pm->pm_active);
 
 #ifdef __powerpc64__
-	PCPU_SET(aim.userslb, pm->pm_slb);
-	__asm __volatile("slbmte %0, %1; isync" ::
+	PCPU_SET(aim.userslb, pm->pm_slb); // no slbie needed?
+ __asm __volatile("isync; slbmte %0, %1; isync" :: "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE)); #else PCPU_SET(curpmap, pm->pmap_phys); @@ -1104,7 +1106,7 @@ { pmap_t pm; =20 - __asm __volatile("isync; slbie %0" :: "r"(USER_ADDR)); + __asm __volatile("isync; slbie %0; isync" :: "r"(USER_ADDR)); =20 pm =3D &td->td_proc->p_vmspace->vm_pmap; CPU_CLR(PCPU_GET(cpuid), &pm->pm_active); @@ -1956,7 +1958,7 @@ (uintptr_t)uaddr >> ADDR_SR_SHFT; curthread->td_pcb->pcb_cpu.aim.usr_vsid =3D slbv; #ifdef __powerpc64__ - __asm __volatile ("slbie %0; slbmte %1, %2; isync" :: + __asm __volatile ("isync; slbie %0; slbmte %1, %2; isync" :: "r"(USER_ADDR), "r"(slbv), "r"(USER_SLB_SLBE)); #else __asm __volatile("mtsr %0,%1; isync" :: "n"(USER_SR), = "r"(slbv)); Index: /mnt/usr/src/sys/powerpc/aim/moea64_native.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /mnt/usr/src/sys/powerpc/aim/moea64_native.c (revision = 347003) +++ /mnt/usr/src/sys/powerpc/aim/moea64_native.c (working copy) @@ -406,7 +406,7 @@ */ =20 #ifdef __powerpc64__ - __asm __volatile ("slbia"); + __asm __volatile ("isync; slbia"); __asm __volatile ("slbmfee %0,%1; slbie %0;" : = "=3Dr"(seg0) : "r"(0)); =20 @@ -417,6 +417,7 @@ __asm __volatile ("slbmte %0, %1" ::=20 "r"(slb[i].slbv), "r"(slb[i].slbe));=20 } + __asm __volatile ("isync"); #else for (i =3D 0; i < 16; i++) mtsrin(i << ADDR_SR_SHFT, = kernel_pmap->pm_sr[i]); Index: /mnt/usr/src/sys/powerpc/aim/slb.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /mnt/usr/src/sys/powerpc/aim/slb.c (revision 347003) +++ /mnt/usr/src/sys/powerpc/aim/slb.c (working copy) @@ -457,7 +457,7 @@ /* If it is for this CPU, put it in the SLB right away */ if (pmap_bootstrapped) { /* slbie not required */ - __asm __volatile ("slbmte %0, %1" ::=20 + __asm __volatile ("isync; slbmte %0, %1; isync" ::=20 "r"(slbcache[i].slbv), "r"(slbcache[i].slbe));=20 } =20 Index: /mnt/usr/src/sys/powerpc/aim/trap_subr64.S =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /mnt/usr/src/sys/powerpc/aim/trap_subr64.S (revision 347003) +++ /mnt/usr/src/sys/powerpc/aim/trap_subr64.S (working copy) @@ -65,6 +65,7 @@ =20 li %r29, 0 /* Set the counter to zero */ =20 + isync slbia slbmfee %r31,%r29 =20 clrrdi %r31,%r31,28 @@ -71,6 +72,7 @@ slbie %r31 1: ld %r31, 0(%r28) /* Load SLB entry pointer */ cmpdi %r31, 0 /* If NULL, stop */ + isync beqlr =20 ld %r30, 0(%r31) /* Load SLBV */ @@ -96,6 +98,7 @@ /* Otherwise, set up SLBs */ li %r29, 0 /* Set the counter to zero */ =20 + isync slbia slbmfee %r31,%r29 =20 clrrdi %r31,%r31,28 @@ -105,6 +108,7 @@ =20 ld %r31, 8(%r28) /* Load SLBE */ cmpdi %r31, 0 /* If SLBE is not valid, stop */ + isync beqlr ld %r30, 0(%r28) /* Load SLBV */ slbmte %r30, %r31 /* Install SLB entry */ @@ -113,6 +117,7 @@ addi %r29, %r29, 1 cmpdi %r29, 64 /* Repeat if we are not at the = end */ blt 1b=20 + isync blr =20 /* Index: /mnt/usr/src/sys/powerpc/include/cpufunc.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= 
--- /mnt/usr/src/sys/powerpc/include/cpufunc.h	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/include/cpufunc.h	(working copy)
@@ -155,15 +155,8 @@
 	return (tb);
 }
 
-static __inline void
-mttb(u_quad_t time)
-{
+// mttb moved to after intr_restore
 
-	mtspr(TBR_TBWL, 0);
-	mtspr(TBR_TBWU, (uint32_t)(time >> 32));
-	mtspr(TBR_TBWL, (uint32_t)(time & 0xffffffff));
-}
-
 static __inline void
 eieio(void)
 {
@@ -202,6 +195,19 @@
 	mtmsr(msr);
 }
 
+static __inline void
+mttb(u_quad_t time)
+{
+	const uint32_t high= time>>32;
+	const uint32_t low= time&0xffffffffu;
+
+	const register_t predisable_msr= intr_disable();
+	mtspr(TBR_TBWL, 0);
+	mtspr(TBR_TBWU, high);
+	mtspr(TBR_TBWL, low);
+	intr_restore(predisable_msr);
+}
+
 static __inline struct pcpu *
 get_pcpu(void)
 {
Index: /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c	(working copy)
@@ -286,8 +286,10 @@
 	if (smp_cpus > 1)
 		atomic_store_rel_int(&smp_started, 1);
 
+printf("cpu_mp_unleash: before DELAY\n");
 	/* Let the APs get into the scheduler */
 	DELAY(10000);
+printf("cpu_mp_unleash: after DELAY\n");
 
 }
 
Index: /mnt/usr/src/sys/powerpc/powerpc/trap.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/trap.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/trap.c	(working copy)
@@ -453,8 +453,8 @@
 #if defined(__powerpc64__) && defined(AIM)
 	case EXC_DSE:
 		if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0 &&
-		    (frame->dar & SEGMENT_MASK) == USER_ADDR) {
-			__asm __volatile ("slbmte %0, %1" ::
+		    (frame->dar & SEGMENT_MASK) == USER_ADDR) { // no slbie needed?
+			__asm __volatile ("isync; slbmte %0, %1; isync" ::
 			    "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE));
 			return;
@@ -712,8 +712,8 @@
 	 * Speculatively restore last user SLB segment, which we know is
 	 * invalid already, since we are likely to do copyin()/copyout().
 	 */
-	if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0)
-		__asm __volatile ("slbmte %0, %1; isync" ::
+	if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0) // no slbie needed?
+		__asm __volatile ("isync; slbmte %0, %1; isync" ::
 		    "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE));
 #endif
 

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)