Date:      Wed, 22 Jun 2011 14:13:26 +0200
From:      Svatopluk Kraus <onwahe@gmail.com>
To:        Uffe Jakobsen <uffe@uffe.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: threads runtime value is incorrect (tc_cpu_ticks() problem)
Message-ID:  <BANLkTimPCE6cc=iM08cWvq6oxUQT_3SK5A@mail.gmail.com>
In-Reply-To: <4E01D4BD.5030809@uffe.org>
References:  <BANLkTikVaKqU9uTtB9nLA1G_mfjKLYuWBg@mail.gmail.com> <4E01D4BD.5030809@uffe.org>

On Wed, Jun 22, 2011 at 1:40 PM, Uffe Jakobsen <uffe@uffe.org> wrote:
>
>
> On 2011-06-22 12:33, Svatopluk Kraus wrote:
>>
>> Hi,
>>
>>   I've tested FreeBSD-current from June 16 2011 on x86 (AMD Elan
>> SC400). I found out that the sum of the runtimes of all threads is
>> about 120 minutes after 180 minutes of system uptime, and the
>> difference is getting worse with time. The problem is in the
>> tc_cpu_ticks() implementation, which takes into account just one
>> timecounter overflow, but in the tested BSP (16-bit hardware counter)
>> more than one overflow very often occurred between two tc_cpu_ticks()
>> calls.
>>
>>   I understand that a 16-bit timecounter is a real relic nowadays, but
>> I would like to solve the problem in some reasonable way. I have a few
>> questions.
>>
>>   According to the description in the definition of the timecounter
>> structure (sys/timetc.h), tc_get_timecount() should read the counter
>> and tc_counter_mask should mask off any unimplemented bits. In
>> tc_cpu_ticks(), if the tick count returned from tc_get_timecount()
>> overflows, then (tc_counter_mask + 1) is added to the result.
>>
>>   However, the timecounter hardware can be initialized to a value from
>> the interval (0, tc_counter_mask>, so if the description of
>> tc_get_timecount() doesn't lie, then adding the (tc_counter_mask + 1)
>> value at all times is not correct. A better description, which
>> satisfies the tc_cpu_ticks() implementation, is that
>> tc_get_timecount() should count the ticks in the interval
>> <0, tc_counter_mask>. That's what i8254_get_timecount() (in
>> sys/x86/isa/clock.c) really does. However, if tc_get_timecount()
>> should count the ticks (and doesn't read the counter), could it then
>> count the ticks in the full uint64_t range? The tc_cpu_ticks()
>> implementation could then be very simple (no masking, no overflow
>> checking). In i8254_get_timecount(), it is enough to change the
>> global variable 'i8254_offset' and the local variable 'count' from
>> uint16_t to uint64_t type.
>>
>>   Now, cpu_ticks() (which points to tc_cpu_ticks() by default) is
>> called from mi_switch(), which must be called often enough to satisfy
>> the tc_cpu_ticks() implementation (which recognizes just one
>> timecounter overflow). That limits some of the system parameters (at
>> least the hz selection).
>>
>>   It looks like tc_counter_mask is a little bit misused?
>>
>>   Maybe tc_cpu_ticks() is only used for backward compatibility and new
>> systems should use set_cputicker() to change this default?
>>
>>   Thanks for any help in understanding this better.
>>
>
> I'm by no means an expert in this field - but your mention of the AMD Elan
> SC400 triggered some old knowledge about the AMD Elan SC520.
>
> If you have a look at sys/i386/i386/elan-mmcr.c, the function
> "init_AMD_Elan_sc520()" addresses the fact that the i8254 has a
> nonstandard frequency, with the AMD Elan SC520 at least - could it be the
> same with the SC400?

You are correct, the AMD Elan SC400 i8254 has a nonstandard frequency,
but that's not the problem. After system startup, no new threads start
and no threads exit, but the sum of the runtimes of all existing threads
is much less than the system uptime, and the difference gets worse with
time. There is only one timecounter in the system. The system uptime is
correct and corresponds to the time measured by my watch.

Svata


