Date: Thu, 14 Mar 2019 14:05:57 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: Bruce Evans <brde@optusnet.com.au>, freebsd-hackers Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <E6A33A82-F98C-4BFA-97B5-16F930586E6C@yahoo.com>
In-Reply-To: <20190314193946.GJ2492@kib.kiev.ua>
References: <20190303111931.GI68879@kib.kiev.ua> <20190303223100.B3572@besplex.bde.org>
 <20190303161635.GJ68879@kib.kiev.ua> <20190304043416.V5640@besplex.bde.org>
 <20190304114150.GM68879@kib.kiev.ua> <20190305031010.I4610@besplex.bde.org>
 <20190306172003.GD2492@kib.kiev.ua> <20190308001005.M2756@besplex.bde.org>
 <20190307222220.GK2492@kib.kiev.ua> <5EED3352-2E8C-4BEE-B281-4AC8DE9570C2@yahoo.com>
 <20190314193946.GJ2492@kib.kiev.ua>

On 2019-Mar-14, at 12:39, Konstantin Belousov <kostikbel at gmail.com> wrote:

> On Thu, Mar 07, 2019 at 05:29:51PM -0800, Mark Millard wrote:
>> A basic question and a small note.
>>
>> Question's context for tc->tc_get_timecount(tc) values:
>>
>> In the powerpc64 context tc->tc_get_timecount(tc) is the lower
>> 32 bits of the tbr, in my context having a 33,333,333 Hz or so
>> increment rate for a machine with a 2.5 GHz or so clock rate.
>> The truncated 32 bit tbr value wraps every 128 seconds or so.
>> 2 sockets, 2 cores per socket, so 4 separate tbr values.
>>
>> The question is . . .
>>
>> In tc_delta's:
>>
>> tc->tc_get_timecount(tc) - th->th_offset_count
>>
>> is observing tc->tc_get_timecount(tc) < th->th_offset_count
>> ever supposed to be possible in correct operation, other than
>> tc->tc_get_timecount(tc) having wrapped around (and so being
>> newly 0 or "near" 0, with no evidence of it having been
>> near 128 seconds or more for my context)?
> I think yes, there is no reason for current get_timecount() value
> to have any arithmetic relation to th_offset_count. Look at tc_windup()
> on how the th_offset_count is calculated. The final value is clamped
> by the tc_counter_mask, so only lower bits are important (higher bits
> are evacuated to th_offset or lost due to overflow if tc_windup()
> was not called soon enough).
>

Okay. Thanks.

Just FYI: I asked because in my powerpc64 context I was seeing
(in sleepq_timeout) td->td_sleeptimo > sbinuptime() in:

        if (td->td_sleeptimo > sbinuptime() || td->td_sleeptimo == 0) {
                /*
                 * The thread does not want a timeout (yet).
                 */

and without such sleeps being rescheduled in that case, those
sleeps hang up.

My hack to temporarily enable useful operation was to have
binuptime avoid tc->tc_get_timecount(tc) < th->th_offset_count
for small enough differences, as shown below:

. . .
        do {
                do { // HACK!!!
                        th = timehands;
                        tc = th->th_counter;
                        gen = atomic_load_acq_int(&th->th_generation);
                        tim_cnt = tc->tc_get_timecount(tc);
                        tim_offset = th->th_offset_count;
                        tim_wrong_order_diff = tim_offset - tim_cnt;
                } while (tim_cnt < tim_offset &&
                    tim_wrong_order_diff < wrong_order_diff_proper_upper_bound); // HACK!!!
                *bt = th->th_offset;
. . .

where I experimentally came up with the following for the specific
PowerMac G5 context:

u_int const wrong_order_diff_proper_upper_bound = 0x14u; // 0x11 is max observed diff so far HACK!!!

I've not had any hung-up sleeps after that change. Despite being a
hack, this gives evidence that
tc->tc_get_timecount(tc) < th->th_offset_count for small enough
differences (in binuptime) is involved in the hangups in some
essential way for the PowerMac G5 context.

I look forward to removing this hack at some point, when things
just work for this 2 socket, 2 cores per socket powerpc64 context.
But for now the hack is locally useful.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)
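
A minimal, stand-alone C sketch of why a reading only slightly below
th_offset_count matters: the masked unsigned subtraction in tc_delta
turns a reading 0x11 ticks "behind" into a delta of almost 2^32 ticks,
and multiplying that by a scale of roughly 2^64 / frequency no longer
fits in 64 bits. The names tc_delta_sketch, approx_scale,
mul_overflows_u64 and sketch_self_check are hypothetical, and the scale
is only an assumed approximation, not the exact tc_windup() computation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for tc_delta(): masked unsigned difference. */
static uint32_t
tc_delta_sketch(uint32_t count, uint32_t offset_count, uint32_t counter_mask)
{
	return ((count - offset_count) & counter_mask);
}

/* Assumption: the scale is approximately 2^64 / frequency. */
static uint64_t
approx_scale(uint64_t frequency_hz)
{
	return (UINT64_MAX / frequency_hz);
}

/* True when a * b does not fit in an unsigned 64-bit value. */
static bool
mul_overflows_u64(uint64_t a, uint64_t b)
{
	return (a != 0 && b > UINT64_MAX / a);
}

/* Small usage example with the figures mentioned in this message. */
static void
sketch_self_check(void)
{
	const uint32_t offset = 0x80000000u;	/* arbitrary offset count */
	const uint64_t scale = approx_scale(33333333u);
	uint32_t behind = tc_delta_sketch(offset - 0x11u, offset, UINT32_MAX);
	uint32_t ahead = tc_delta_sketch(offset + 0x11u, offset, UINT32_MAX);

	assert(behind == 0xffffffefu);	/* nearly a full 32-bit wrap */
	assert(ahead == 0x11u);
	assert(mul_overflows_u64(scale, behind));
	assert(!mul_overflows_u64(scale, ahead));
}
```

With a 32-bit mask, being 0x11 ticks behind looks the same as being
0xffffffef ticks ahead, which at about 33,333,333 Hz corresponds to
roughly the 128 second wrap interval.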