Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 15 Mar 2019 06:57:16 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Mark Millard <marklmi@yahoo.com>, Bruce Evans <brde@optusnet.com.au>,  freebsd-hackers Hackers <freebsd-hackers@freebsd.org>,  FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID:  <20190315064830.F7981@besplex.bde.org>
In-Reply-To: <20190314193946.GJ2492@kib.kiev.ua>
References:  <20190303111931.GI68879@kib.kiev.ua> <20190303223100.B3572@besplex.bde.org> <20190303161635.GJ68879@kib.kiev.ua> <20190304043416.V5640@besplex.bde.org> <20190304114150.GM68879@kib.kiev.ua> <20190305031010.I4610@besplex.bde.org> <20190306172003.GD2492@kib.kiev.ua> <20190308001005.M2756@besplex.bde.org> <20190307222220.GK2492@kib.kiev.ua> <5EED3352-2E8C-4BEE-B281-4AC8DE9570C2@yahoo.com> <20190314193946.GJ2492@kib.kiev.ua>

index | next in thread | previous in thread | raw e-mail

On Thu, 14 Mar 2019, Konstantin Belousov wrote:

> On Thu, Mar 07, 2019 at 05:29:51PM -0800, Mark Millard wrote:
>> A basic question and a small note.
>>
>> Question's context for it tc->tc_get_timecount(tc) values:
>>
>> In the powerpc64 context tc->tc_get_timecount(tc) is the lower
>> 32 bits of the tbr, in my context having a 33,333,333 MHz or so
>> increment rate for a machine with a 2.5 GHz or so clock rate.
>> The truncated 32 bit tbr value wraps every 128 seconds or so.
>> 2 sockets, 2 cores per socket, so 4 separate tbr values.
>>
>> The question is . . .
>>
>> In tc_delta's:
>>
>>     tc->tc_get_timecount(tc) - th->th_offset_count
>>
>> is observing tc->tc_get_timecount(tc) < th->th_offset_count
>> ever supposed to be possible in correct operation, other than
>> tc->tc_get_timecount(tc) having wrapped around (and so being
>> newly 0 or "near" 0, no evidence of of having it having been
>> near 128 seconds or more for my context)?
> I think yes, there is no reason for current get_timecount() value
> to have any arithmetic relation to th_offset_count.  Look at tc_windup()
> on how the th_offset_count is calculated.  The final value is clamped
> by the tc_counter_mask, so only lower bits are important (higher bits
> are evacuated to th_offset or lost due to overflow if tc_windup()
> was not called soon enough).

Yes, it is a standard method to calculate time differences from a possibly-
wrapped counter as (finish - start) & mask in unsigned arithmetic, where
the counter must be checked before it wraps relative to 'start'.

>> The note:
>>
>> On 2019-Mar-7, at 14:22, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>
>>> . . .
>>> +
>>> +	if (__predict_false(delta < large_delta)) {
>>
>> I thought that delta<large_delta was the non-overflow context
>> for scale*delta and that the overflow case for the multiplication
>> was when delta>=large_delta .
> You are right, I fixed this in my repo.
>>
>>> +		/* Avoid overflow for scale * delta. */
>>> +		x = (scale >> 32) * delta;
>>> +		bt->sec += x >> 32;
>>> +		bintime_addx(bt, x << 32);
>>> +		bintime_addx(bt, (scale & 0xffffffff) * delta);
>>> +	} else {
>>> +		bintime_addx(bt, scale * delta);
>>> +	}
>>> . . .

Fixed in my version too.

I might have helped break this.  I reversed the condition to get the
unusual path executed (though not when it overflow), and forgot to
undo this.  At least the unusual path got checked more).

Bruce


home | help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190315064830.F7981>