From: Mark Millard
Date: Fri, 5 Apr 2019 01:40:07 -0700
To: Bruce Evans
Cc: Konstantin Belousov, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
In-Reply-To: <20190405150236.A959@besplex.bde.org>

On 2019-Apr-4, at 21:38, Bruce Evans wrote:

> On Thu, 4 Apr 2019, Mark Millard wrote:
>
>> On 2019-Apr-3, at 08:47, Bruce Evans wrote:
>>> . . .
>>>
>>> I noticed (or better realized) a general problem with multiple
>>> timehands.  ntpd can slew the clock at up to 500 ppm, and at least an
>>> old version of it uses a rate of 50 ppm to fix up fairly small drifts
>>> in the milliseconds range.  500 ppm is enormous in CPU cycles -- it is
>>> 500 thousand nsec or 2 million cycles at 4GHz.  Winding up the timecounter
>>> every 1 msec reduces this to only 2000 cycles.
>>>
>>> More details of ordering and timing for 1 thread:
>>> ...
>> Thanks for the description of an example way that sbinuptime and
>> the like might not give weakly increasing results.
>>
>> Unfortunately, all the multi-socket contexts that I sometimes have
>> access to are old PowerMacs. And, currently, the only such context
>> is the G5 with 2 sockets, 2 cores per socket (powerpc64). So I've
>> not been able to set up other types of examples to see if problems
>> repeat.
>>
>> I do not have access to a single-socket powerpc64 for contrast in
>> that direction.
>
> Testing 1 socket is time-consuming enough.  Do these old systems
> use the equivalent of an x86 TSC for the timecounter?
> With multiple
> sockets, it isn't clear how even a hardware timer independent of the
> CPUs can be distributed so as to appear to be monotonic on all cores.

"The DEC frequency is based on the same implementation-dependent
frequency that drives the time base." The frequency may well be fixed
by the PowerMac G5 model implementation but is not fixed by the
powerpc64 architecture.

The Time Base Register (TBR) in a powerpc64 core (cpu in FreeBSD terms)
increments at 33,333,333 Hz (nominal) and is 64 bits wide. Its value
can be set (mttb instruction) and the boot sequence in FreeBSD does
attempt to adjust it as each FreeBSD CPU is brought up/started. mftb is
used to read the 64-bit value. FreeBSD masks it down to 32 bits to
contribute to the time-counter.

(Is that description sufficient for what you were after? I've never
seen documentation of how the 33,333,333 Hz is produced.)

As FreeBSD supports multi-socket, what are its criteria for a
sufficient context for it to work with for supporting sbinuptime
and the like? Is FreeBSD supposed to then make it appear that
sbinuptime and the like are weakly increasing, even as threads
migrate between CPUs (cores, hw-threads)?

>> One oddity is that the eventtimer's decrementer and timecounter
>> may change (nearly) together: both change at 33,333,333 Hz, as if
>> they are tied to the same clock (at least on one socket).
>
> I think this is from a normal hardware implementation.  On all of
> my x86 systems with a TSC, the TSC frequency is an exact fractional
> multiple of the i8254, the ACPI timer (if present) and the HPET (if
> present).  Only the RTC has an independent frequency.  The fraction is
> changed by changing the nominal TSC frequency in the BIOS, but is not
> changed by temperature variations.  This must be because most clocks are
> derived from a common clock using a PLL.  I use this to calibrate all
> clocks (except the RTC) by calibrating only 1.
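
Returning for a moment to the 32-bit masking of the Time Base that I
described above: the following is a small Python sketch of my own, not
FreeBSD code. The names TB_FREQ_HZ, TC_MASK, masked, tc_delta, and
wrap_seconds are made up for illustration, and the frequency is only the
nominal 33,333,333 Hz. It is meant to show that a delta taken modulo
2**32 stays correct across a single wrap of the low 32 bits, and roughly
how often that wrap happens at this frequency.

```python
TB_FREQ_HZ = 33_333_333      # nominal Time Base frequency (assumed exact here)
TC_MASK = 0xFFFFFFFF         # 32-bit mask applied to the 64-bit Time Base


def masked(tb64: int) -> int:
    """Low 32 bits of a 64-bit Time Base value."""
    return tb64 & TC_MASK


def tc_delta(now32: int, then32: int) -> int:
    """Counts elapsed between two masked readings, modulo 2**32."""
    return (now32 - then32) & TC_MASK


def wrap_seconds() -> float:
    """Seconds for the masked 32-bit counter to wrap once."""
    return (TC_MASK + 1) / TB_FREQ_HZ


if __name__ == "__main__":
    then = 0x0000_0001_FFFF_FF00   # just before the low 32 bits wrap
    now = then + 0x200             # 512 counts later, after the wrap
    print(tc_delta(masked(now), masked(then)))   # 512
    print(round(wrap_seconds(), 1))              # 128.8
```

So, under these assumptions, the masked value wraps a bit more often than
every 129 seconds, and a delta is only meaningful if fewer than 2**32
counts have passed between the two readings.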

I'm not aware of the OpenFirmware having any control over the
TBR-change frequency behavior. I've no evidence about any
variability based on temperature.

>> In case it helps with knowing how analogous your investigations
>> are to the original problem context, I report the following.
>>
>> If you do not care for such information, stop reading here.
>>
>> # grep ntpd /etc/rc.conf
>> ntpd_enable="YES"
>> ntpd_sync_on_start="YES"
>>
>> # sysctl kern.eventtimer
>> kern.eventtimer.periodic: 0
>> kern.eventtimer.timer: decrementer
>> kern.eventtimer.idletick: 0
>> kern.eventtimer.singlemul: 2
>> kern.eventtimer.choice: decrementer(1000)
>> kern.eventtimer.et.decrementer.quality: 1000
>> kern.eventtimer.et.decrementer.frequency: 33333333
>> kern.eventtimer.et.decrementer.flags: 7
>>
>> # vmstat -ai | grep decrementer
>> cpu0:decrementer                 4451007         35
>> cpu3:decrementer                 1466010         11
>> cpu2:decrementer                 1481722         12
>> cpu1:decrementer                 1478618         12
>
> Powerpc seems to have a PLL in software too.  Event timers don't need to
> be very precise or accurate.

powerpc64 sets the value to count down from (in the 32-bit DEC register)
via the mtdec instruction. I'm not aware of being able to change the
frequency that the countdown occurs at on the old PowerMac G5's.
(mtdec is a form of mtspr SPRNUM,rs .)

[Accessing the DEC's value is via a privileged instruction on
powerpc64 vs. a non-privileged one on power, different instruction
encodings but the same mnemonic unless mfspr is used directly as the
mnemonic. Just a difference in one bit position in the SPRNUM in
the encoding.]

>> (That last is from a basically-idle timeframe.)
>>
>> # sysctl -a | grep hz
>> kern.clockrate: { hz = 1000, tick = 1000, profhz = 8128, stathz = 127 }
>> kern.hz: 1000
>
> x86 is similar.
> I think synchronization from using PLLs still gives
> unfair scheduling, but with multiple CPUs and often more cycles than can
> be used, no one cares about accidental synchronization or bothers to steal
> cycles using intentional synchronization.

Good to know.

>> # sysctl kern.timecounter
>> kern.timecounter.fast_gettime: 1
>> kern.timecounter.tick: 1
>> kern.timecounter.choice: timebase(0) dummy(-1000000)
>> kern.timecounter.hardware: timebase
>> kern.timecounter.alloweddeviation: 5
>> kern.timecounter.stepwarnings: 0
>> kern.timecounter.tc.timebase.quality: 0
>> kern.timecounter.tc.timebase.frequency: 33333333
>> kern.timecounter.tc.timebase.counter: 1144662532
>> kern.timecounter.tc.timebase.mask: 4294967295
>>
>> (The actual Time Base Register (tbr) is 64 bits
>> and freebsd truncates it down.)
>>
>> # sysctl -a | grep 'cpu.*freq'
>> device cpufreq
>> debug.cpufreq.verbose: 0
>> debug.cpufreq.lowest: 0
>> dev.cpufreq.0.%parent: cpu3
>> dev.cpufreq.0.%pnpinfo:
>> dev.cpufreq.0.%location:
>> dev.cpufreq.0.%driver: cpufreq
>> dev.cpufreq.0.%desc:
>> dev.cpufreq.%parent:
>> dev.cpu.3.freq_levels: 2500/-1 1250/-1
>> dev.cpu.3.freq: 2500
>>
>> So 2500 MHz / 33333333 Hz is very near 75 clock periods per
>> timebase counter value.
>
> Looks like it is exactly 75.  Fractions are especially easy to guess and
> verify when they are integral.

I'm not sure what happens for DEC and TBR change frequencies if the
2500 MHz cpu frequency is dropped down to 1250 MHz. But as I understand
my context, 1250 MHz is not in use at all, just 2500 MHz.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)
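
P.S. For the "very near 75" vs. "exactly 75" point above, here is a small
Python check of the arithmetic. It is only a sketch: the names are made up
for illustration, and TB_HZ_ASSUMED (a Time Base of exactly 100/3 MHz) is
my assumption about what the rounded sysctl figure stands for, not
something I have documentation for.

```python
from fractions import Fraction

CPU_HZ = 2_500_000_000                     # dev.cpu.3.freq: 2500 (MHz)
TB_HZ_REPORTED = 33_333_333                # kern.timecounter.tc.timebase.frequency
TB_HZ_ASSUMED = Fraction(100_000_000, 3)   # assumption: 33.333... MHz exactly


def cycles_per_tick(cpu_hz, tb_hz):
    """CPU clock periods per Time Base increment, as an exact fraction."""
    return Fraction(cpu_hz) / Fraction(tb_hz)


if __name__ == "__main__":
    # With the rounded integer frequency the ratio is slightly above 75.
    print(float(cycles_per_tick(CPU_HZ, TB_HZ_REPORTED)))
    # With the assumed exact 100/3 MHz the ratio is the integer 75.
    print(cycles_per_tick(CPU_HZ, TB_HZ_ASSUMED))   # 75
```

If the assumption holds, the small excess over 75 in the first figure is
just from sysctl reporting the frequency as a whole number of Hz.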