Date: Sat, 2 Mar 2019 16:25:21 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: Bruce Evans <brde@optusnet.com.au> Cc: Mark Millard <marklmi@yahoo.com>, freebsd-hackers Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org> Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] Message-ID: <20190302142521.GE68879@kib.kiev.ua> In-Reply-To: <20190302225513.W3408@besplex.bde.org> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua> <20190302071425.G5025@besplex.bde.org> <20190302105140.GC68879@kib.kiev.ua> <20190302225513.W3408@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, Mar 03, 2019 at 12:03:18AM +1100, Bruce Evans wrote: > On Sat, 2 Mar 2019, Konstantin Belousov wrote: > > > On Sat, Mar 02, 2019 at 07:38:20AM +1100, Bruce Evans wrote: > >> On Fri, 1 Mar 2019, Konstantin Belousov wrote: > >>> + scale = th->th_scale; > >>> + delta = tc_delta(th); > >>> + if (__predict_false(th->th_scale_bits + fls(delta) > 63)) { > >> > >> Better, but shouldn't be changed (and the bug that causes the large intervals > >> remains unlocated), and if it is changed then it should use: > >> > >> if (delta >= th->th_large_delta) > >> > >>> @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin) > >>> scale += (th->th_adjustment / 1024) * 2199; > >>> scale /= th->th_counter->tc_frequency; > >>> th->th_scale = scale * 2; > >>> +#ifdef _LP64 > >>> + th->th_scale_bits = ffsl(th->th_scale); > >>> +#else > >>> + th->th_scale_bits = ffsll(th->th_scale); > >>> +#endif > >> > >> th->th_large_delta = ((uint64_t)1 << 63) / scale; > > > > So I am able to reproduce it with some surprising ease on HPET running > > on Haswell. > > So what is the cause of it? Maybe the tickless code doesn't generate > fake clock ticks right. Or it is just a library bug. The kernel has > to be slightly real-time to satisfy the requirement of 1 update per. > Applications are further from being real-time. But isn't it enough > for the kernel to ensure that the timehands cycle more than once per > second? No, I entered ddb as you suggested. > > > I looked at the generated code for libc which still uses ffsll() on 32bit, > > due to the ABI issues. At least clang generates two BSF instructions for > > this code, so I think that forking vdso_timehands ABI for this is not > > reasonable right now. > > > > diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c > > index 3749e0473af..3c3c71207bd 100644 > > --- a/lib/libc/sys/__vdso_gettimeofday.c > > +++ b/lib/libc/sys/__vdso_gettimeofday.c > > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$"); > > #include <sys/time.h> > > #include <sys/vdso.h> > > #include <errno.h> > > +#include <limits.h> > > +#include <strings.h> > > #include <time.h> > > #include <machine/atomic.h> > > #include "libc_private.h" > > @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) > > { > > struct vdso_timehands *th; > > uint32_t curr, gen; > > - u_int delta; > > + uint64_t scale, x; > > + u_int delta, scale_bits; > > int error; > > > > do { > > @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) > > continue; > > if (error != 0) > > return (error); > > - bintime_addx(bt, th->th_scale * delta); > > + scale = th->th_scale; > > +#ifdef _LP64 > > + scale_bits = ffsl(scale); > > +#else > > + scale_bits = ffsll(scale); > > +#endif > > I see that there is an ABI problem adding th_large_delta. > > > + if (__predict_false(scale_bits + fls(delta) > 63)) { > > + x = (scale >> 32) * delta; > > + scale &= UINT_MAX; > > Should be UINT32_MAX or better 0xffffffff. Ok. > > > + bt->sec += x >> 32; > > + bintime_addx(bt, x << 32); > > + } > > + bintime_addx(bt, scale * delta); > > if (abs) > > bintime_add(bt, &th->th_boottime); > > I don't changing this at all this. binuptime() was carefully written > to not need so much 64-bit arithmetic. > > If this pessimization is allowed, then it can also handle a 64-bit > deltas. Using the better kernel method: > > if (__predict_false(delta >= th->th_large_delta)) { > bt->sec += (scale >> 32) * (delta >> 32); > x = (scale >> 32) * (delta & 0xffffffff); > bt->sec += x >> 32; > bintime_addx(bt, x << 32); > x = (scale & 0xffffffff) * (delta >> 32); > bt->sec += x >> 32; > bintime_addx(bt, x << 32); > bintime_addx(bt, (scale & 0xffffffff) * > (delta & 0xffffffff)); > } else > bintime_addx(bt, scale * (delta & 0xffffffff)); This only makes sense if delta is extended to uint64_t, which requires the pass over timecounters. > > I just noticed that there is a 64 x 32 -> 64 bit multiplication in the > current method. This can be changed to do expicit 32 x 32 -> 64 bit > multiplications and fix the overflow problem at small extra cost on > 32-bit arches: > > /* 32-bit arches did the next multiplication implicitly. */ > x = (scale >> 32) * delta; > /* > * And they did the following shifts and most of the adds > * implicitly too. Except shifting x left by 32 lost the > * seconds part that the next line handles. The next line > * is the only extra cost for them. > */ > bt->sec += x >> 32; > bintime_addx(bt, (x << 32) + (scale & 0xffffffff) * delta); Ok, what about the following. diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c index 3749e0473af..cfe3d96d001 100644 --- a/lib/libc/sys/__vdso_gettimeofday.c +++ b/lib/libc/sys/__vdso_gettimeofday.c @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$"); #include <sys/time.h> #include <sys/vdso.h> #include <errno.h> +#include <limits.h> +#include <strings.h> #include <time.h> #include <machine/atomic.h> #include "libc_private.h" @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) { struct vdso_timehands *th; uint32_t curr, gen; - u_int delta; + uint64_t scale, x; + u_int delta, scale_bits; int error; do { @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) continue; if (error != 0) return (error); - bintime_addx(bt, th->th_scale * delta); + scale = th->th_scale; +#ifdef _LP64 + scale_bits = ffsl(scale); +#else + scale_bits = ffsll(scale); +#endif + if (__predict_false(scale_bits + fls(delta) > 63)) { + x = (scale >> 32) * delta; + scale &= 0xffffffff; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); + } + bintime_addx(bt, scale * delta); if (abs) bintime_add(bt, &th->th_boottime); diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c index 2656fb4d22f..2e28f872229 100644 --- a/sys/kern/kern_tc.c +++ b/sys/kern/kern_tc.c @@ -72,6 +72,7 @@ struct timehands { struct timecounter *th_counter; int64_t th_adjustment; uint64_t th_scale; + uint64_t th_large_delta; u_int th_offset_count; struct bintime th_offset; struct bintime th_bintime; @@ -351,17 +352,44 @@ fbclock_getmicrotime(struct timeval *tvp) } while (gen == 0 || gen != th->th_generation); } #else /* !FFCLOCK */ + +static void +bintime_helper(struct bintime *bt, uint64_t *scale, u_int delta) +{ + uint64_t x; + + x = (*scale >> 32) * delta; + *scale &= 0xffffffff; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); +} + void binuptime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_offset; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); +#ifdef _LP64 + /* Avoid overflow for scale * delta. */ + if (__predict_false(th->th_large_delta <= delta)) + bintime_helper(bt, &scale, delta); + bintime_addx(bt, scale * delta); +#else + /* + * Also avoid (uint64_t, uint32_t) -> uint64_t + * multiplication on 32bit arches. + */ + bintime_helper(bt, &scale, delta); + bintime_addx(bt, (u_int)scale * delta); +#endif atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } @@ -388,13 +416,28 @@ void bintime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_bintime; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); +#ifdef _LP64 + /* Avoid overflow for scale * delta. */ + if (__predict_false(th->th_large_delta <= delta)) + bintime_helper(bt, &scale, delta); + bintime_addx(bt, scale * delta); +#else + /* + * Also avoid (uint64_t, uint32_t) -> uint64_t + * multiplication on 32bit arches. + */ + bintime_helper(bt, &scale, delta); + bintime_addx(bt, (u_int)scale * delta); +#endif atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } @@ -1464,6 +1507,7 @@ tc_windup(struct bintime *new_boottimebin) scale += (th->th_adjustment / 1024) * 2199; scale /= th->th_counter->tc_frequency; th->th_scale = scale * 2; + th->th_large_delta = ((uint64_t)1 << 63) / scale; /* * Now that the struct timehands is again consistent, set the new
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190302142521.GE68879>