From owner-freebsd-hackers@freebsd.org Thu Feb 28 14:55:55 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 941A6150A71C; Thu, 28 Feb 2019 14:55:55 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 162356E768; Thu, 28 Feb 2019 14:55:54 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id x1SEthXj081104 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 28 Feb 2019 16:55:46 +0200 (EET) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua x1SEthXj081104 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id x1SEtgqQ081103; Thu, 28 Feb 2019 16:55:42 +0200 (EET) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Thu, 28 Feb 2019 16:55:42 +0200 From: Konstantin Belousov To: Mark Millard Cc: FreeBSD PowerPC ML , freebsd-hackers Hackers Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes Message-ID: <20190228145542.GT2420@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.11.2 (2019-01-07) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2019 14:55:55 -0000 On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote: > Basic context: > > The code for sleeps of various forms depends > on calls to: > > static __inline sbintime_t > sbinuptime(void) > { > struct bintime _bt; > > binuptime(&_bt); > return (bttosbt(_bt)); > } > > and comparisons with the return values, such > as checking for timeouts. The upper 32 bits > of the unsigned 64 bit result has the seconds > and the lower 32 bits has the fraction as > a multiplier of 1sec/(2**64). > > An observed problem is that later sbinuptime calls > sometimes end up with smaller values than earlier > ones. (Past lisy freebsd-ppc messages give details.) > This makes for problems with checking for timeouts > when using later sbinuptime() calls after a timeout > was initially detected against an earlier value: > > A.0) timercb getting the earlier sbinuptime() value > A.1) callout_process using that to detect a timeout, > B) sleepq_timeout checking the timeout again, > using a separate sbinuptime() call. > > Some details about example values, overflows, and such follow. > > I used the following sort of hacked code to report values when > overflows happen: > > #if defined(__powerpc64__) && defined(AIM) > void > binuptime(struct bintime *bt) > { > struct timehands *th; > u_int gen; > > struct timecounter *tc; // HACK!!! > u_int tim_cnt, tim_offset, tim_diff; // HACK!!! > uint64_t freq, scale_factor, diff_scaled; // HACK!!! > > do { > th = timehands; > tc = th->th_counter; // HACK!!! > gen = atomic_load_acq_int(&th->th_generation); > *bt = th->th_offset; > tim_cnt= tc->tc_get_timecount(tc); // HACK!!! (steps of tc_diff with values recorded) > tim_offset= th->th_offset_count; // HACK!!! > tim_diff= (tim_cnt - tim_offset) & tc->tc_counter_mask; // HACK!!! > scale_factor= th->th_scale; // HACK!!! > diff_scaled= scale_factor * tim_diff; // HACK!!! > bintime_addx(bt, diff_scaled); // HACK!!! > freq= tc->tc_frequency; // HACK!!! > atomic_thread_fence_acq(); > } while (gen == 0 || gen != th->th_generation); > > if (*(volatile uint64_t*)0xc0000000000000f0==0u && (0xffffffffffffffffull/scale_factor) *(volatile uint64_t*)0xc0000000000000d0= freq; > *(volatile uint64_t*)0xc0000000000000d8= scale_factor; > *(volatile u_int*)0xc0000000000000e0= tim_offset; > *(volatile u_int*)0xc0000000000000e4= tim_cnt; > *(volatile u_int*)0xc0000000000000e8= tim_diff; > *(volatile uint64_t*)0xc0000000000000f0= diff_scaled; > *(volatile uint64_t*)0xc0000000000000f8= scale_factor*freq; > __asm__ ("sync"); > } > } > #else > . . . > #endif > > (mtfb() is used to provide the tc->tc_get_timecount(tc) > value --but only the lower 32 bits are extracted and > returned.) > > Basically whenever tim_diff is such that: > > (0xffffffffffffffff/scale_factor) > then diff_scaled overflows an unsigned, 64 bit representation, > ending up with just the least 64 bits. This truncated value > ends up being used in: > > bintime_addx(bt, diff_scaled); > > Observed consistently for tc->tc_frequency: > > tc->tc_frequency == 0x1fca055 (i.e., 33333333) > > ( tc->tc_counter_mask is 0xfffffffful as well. ) > > An example observation of diff_scaled having an overflowed > value is: > > scale_factor == 0x80da2067ac > scale_factor*freq overflows unsigned, 64 bit representation. > tim_offset == 0x3da0eaeb > tim_cnt == 0x42dea3c4 > tim_diff == 0x53db8d9 > For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor > scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits) > > So scale_factor * tim_diff leaves diff_scaled truncated to > the least significant 64 bits, which does not preserve > ordering properties. > > Another example: > > scale_factor == 0x80d95962c0 > scale_factor*freq == 0xfffffffffd65c9c0 > tim_offset == 0x4d1fb8e2 > tim_cnt == 0x4d1fb8e1 > tim_diff == 0xffffffff > For reference: 0x1fca055 == 0xffffffffffffffffull/scale_factor > scaled_diff == 0xD959623F26A69D40 (truncated to 64 bits) > > Again the diff_scaled holds a truncated value from > scale_factor * tim_diff . > > Another example: > > scale_factor == 0x80da20c940 > scale_factor*freq overflows unsigned, 64 bit representation. > tim_offset == 0x9a7f5cdb > tim_cnt == 0xb26bbd5 > tim_diff == 0x70a75efa > For reference: 0x1fc9d41 == 0xffffffffffffffffull/scale_factor > scaled_diff == 0xB3AC715C56AA0880 (truncated to 64 bits) > > Again the diff_scaled holds a truncated value from > scale_factor * tim_diff . > > Note that the scale_factor does vary. > Try the following (I did not even booted it). If worked out, ffclock counterpart also needs the patching. diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c index 2656fb4d22f..19e81bbf023 100644 --- a/sys/kern/kern_tc.c +++ b/sys/kern/kern_tc.c @@ -355,13 +355,20 @@ void binuptime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_offset; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); + if (fls(scale) + fls(delta) > 63) { + bt->sec += (scale >> 32) * delta; + scale &= UINT_MAX; + } + bintime_addx(bt, scale * delta); atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } @@ -388,13 +395,20 @@ void bintime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_bintime; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); + if (fls(scale) + fls(delta) > 63) { + bt->sec += (scale >> 32) * delta; + scale &= UINT_MAX; + } + bintime_addx(bt, scale * delta); atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); }