Date: Sun, 27 Mar 2011 23:14:22 -0600
From: Warner Losh <imp@bsdimp.com>
To: Julian Elischer <julian@FreeBSD.org>
Cc: kostikbel@gmail.com, freebsd-hackers@FreeBSD.org, John Baldwin <jhb@FreeBSD.org>, Jing Huang <jing.huang.pku@gmail.com>
Subject: Re: [GSoc] Timeconter Performance Improvements
Message-ID: <F34E75DB-F401-4648-96CB-7B7F5D4E4CEB@bsdimp.com>
In-Reply-To: <4D900EB4.2050500@freebsd.org>
References: <AANLkTimbBohQmoPv19Qq2U6M70OBx%2BFBMiUAzQmqrTLK@mail.gmail.com> <201103250818.38470.jhb@freebsd.org> <20110326121646.GA2367@server.vk2pj.dyndns.org> <201103261012.32884.jhb@freebsd.org> <AANLkTimjj6dimyoY1K4xKabiNeAMjSt-YXjFpdaTJCTr@mail.gmail.com> <703A54EA-3C99-4BAF-923B-91B50BFFC748@bsdimp.com> <4D900EB4.2050500@freebsd.org>

On Mar 27, 2011, at 10:29 PM, Julian Elischer wrote:
> On 3/27/11 3:32 PM, Warner Losh wrote:
>> On Mar 26, 2011, at 8:43 AM, Jing Huang wrote:
>>
>>> Hi,
>>>
>>> My sincere thanks to you all. Under your guidance, I read the
>>> specification of the TSC in the Intel manual and learned the hardware
>>> features of the TSC:
>>>
>>> Processor families increment the time-stamp counter differently:
>>> • For Pentium M processors (family [06H], models [09H, 0DH]); for Pentium 4
>>> processors, Intel Xeon processors (family [0FH], models [00H, 01H, or 02H]);
>>> and for P6 family processors: the time-stamp counter increments with every
>>> internal processor clock cycle.
>>>
>>> • For Pentium 4 processors, Intel Xeon processors (family [0FH],
>>> models [03H and higher]); for Intel Core Solo and Intel Core Duo processors
>>> (family [06H], model [0EH]); for the Intel Xeon processor 5100 series and
>>> Intel Core 2 Duo processors (family [06H], model [0FH]); for Intel Core 2
>>> and Intel Xeon processors (family [06H], display_model [17H]); for Intel
>>> Atom processors (family [06H], display_model [1CH]): the time-stamp counter
>>> increments at a constant rate.
>>>
>>> Maybe we could implement gettimeofday as follows. First, use cpuid
>>> to find the family and model of the current CPU. If the CPU supports
>>> a constant TSC, we look up the shared page and calculate the precise
>>> time in user mode. If the platform does not have an invariant TSC, we
>>> just fall back to a syscall. So I think a single global shared page may
>>> be appropriate.
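
The family/model test described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names `cpu_sig`, `decode_signature` and `tsc_is_constant_rate` are hypothetical, the input is the EAX value returned by CPUID leaf 1, and the list of constant-rate models is limited to the ones in the manual excerpt quoted above. Newer processors advertise an invariant TSC through CPUID.80000007H:EDX[8], which this sketch does not check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded display family and model. */
struct cpu_sig {
    unsigned family;
    unsigned model;
};

/* Decode the display family/model from the EAX value of CPUID leaf 1. */
static struct cpu_sig decode_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;
    unsigned ext_model   = (eax >> 16) & 0xf;
    struct cpu_sig sig;

    /* The extended family is only added when the base family is 0FH. */
    sig.family = base_family;
    if (base_family == 0xf)
        sig.family += ext_family;

    /* The extended model only applies to base families 06H and 0FH. */
    sig.model = base_model;
    if (base_family == 0x6 || base_family == 0xf)
        sig.model |= ext_model << 4;

    return sig;
}

/* True only for the constant-rate entries listed in the quoted excerpt. */
static bool tsc_is_constant_rate(struct cpu_sig sig)
{
    if (sig.family == 0x0f)
        return sig.model >= 0x03;
    if (sig.family == 0x06)
        return sig.model == 0x0e || sig.model == 0x0f ||
               sig.model == 0x17 || sig.model == 0x1c;
    return false;
}
```

A table like this has to be extended for every new CPU model, so where it lives (kernel or libc) determines which component needs updating.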
>> I think that the userspace portion should be more like:
>>
>> int kernel_time_type SECTION(shared);
>> struct tsc_goo tsc_time_data SECTION(shared);
>>
>> switch (kernel_time_type) {
>> case 1:
>>         /* code to use tsc_time_data to return time */
>>         break;
>> default:
>>         /* call the kernel */
>>         break;
>> }
>>
>> I think we should avoid hard-coding lists of CPU families in userland.
>> The kernel init routines will decide, based on the CPU type and other
>> stuff, if this optimization can be done. This would allow the kernel to
>> update to support new CPU types without needing to churn libc.
>>
>> Warner
>>
>> P.S. The SECTION(shared) notation above just means that the variables
>> are in the shared page.
>
> As has been mentioned here and there, the gold-standard way of doing this
> is for the kernel to export a special memory region in ELF format that can
> be linked to, with exported kernel-sanctioned code snippets specially
> tailored for the CPU/OS/binary format in question. There is no real
> security risk to this, but the potential upsides are great.
You'll have to map multiple pages if you do this: one for the data that
has to be exported from the kernel and one for the executable code. I
don't think this is necessarily the "gold standard" at all. I think it is
overkill that we'll grow to regret.
With my method, you'll have the code 100% in userland, where it belongs. If
you want to map CPU-type-specific code, add it to ld.so.
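
A minimal sketch of what that userland code could look like, under loud assumptions: the fields of `struct tsc_goo` are invented for illustration (the thread only names the struct), the two variables are ordinary statics standing in for data the kernel would place in the shared page, and `read_tsc`, `tsc_to_timeval` and `user_gettimeofday` are hypothetical names. It also omits any protocol for reading the shared data consistently while the kernel updates it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical layout; the real contents would be defined by the kernel. */
struct tsc_goo {
    uint64_t base_tsc;   /* counter value when base_usec was sampled */
    uint64_t base_usec;  /* wall-clock time at base_tsc, in microseconds */
    uint64_t tsc_hz;     /* counter frequency in Hz, assumed constant */
};

/* Stand-ins for the variables that would live in the shared page. */
static int kernel_time_type;
static struct tsc_goo tsc_time_data;

/* Read the counter; non-x86 builds just return the base value. */
static uint64_t read_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    return __builtin_ia32_rdtsc();   /* GCC/Clang builtin */
#else
    return tsc_time_data.base_tsc;
#endif
}

/* Convert a counter reading to a timeval using the exported data. */
static void tsc_to_timeval(const struct tsc_goo *d, uint64_t tsc,
                           struct timeval *tv)
{
    assert(d->tsc_hz != 0);
    uint64_t delta = tsc - d->base_tsc;
    /* Split into whole seconds and remainder to avoid 64-bit overflow. */
    uint64_t usec = d->base_usec
        + (delta / d->tsc_hz) * 1000000ULL
        + (delta % d->tsc_hz) * 1000000ULL / d->tsc_hz;
    tv->tv_sec = usec / 1000000ULL;
    tv->tv_usec = usec % 1000000ULL;
}

/* Userland entry point: fast path if the kernel enabled it, else syscall. */
static int user_gettimeofday(struct timeval *tv)
{
    switch (kernel_time_type) {
    case 1:
        tsc_to_timeval(&tsc_time_data, read_tsc(), tv);
        return 0;
    default:
        return gettimeofday(tv, NULL);   /* call the kernel */
    }
}
```

The only thing userland has to know is the meaning of each `kernel_time_type` value; the decision about whether the TSC is usable stays in the kernel.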
Warner
>>>
>>> On Sat, Mar 26, 2011 at 10:12 PM, John Baldwin <jhb@freebsd.org> wrote:
>>>> On Saturday, March 26, 2011 08:16:46 am Peter Jeremy wrote:
>>>>> On 2011-Mar-25 08:18:38 -0400, John Baldwin <jhb@freebsd.org> wrote:
>>>>>> For modern Intel CPUs you can just assume that the TSCs are in sync
>>>>>> across packages. They also have invariant TSCs, meaning that the
>>>>>> frequency doesn't change.
>>>>> Synchronised P-state invariant TSCs vastly simplify the problem but
>>>>> not everyone has them. Should the fallback be more complexity to
>>>>> support per-CPU TSC counts and varying frequencies or a fallback to
>>>>> reading the time via a syscall?
>>>> I think we should just fall back to a syscall in that case. We will also
>>>> need to do that if the TSC is not used as the timecounter (or always
>>>> duplicate the ntp_adjtime() work we do for the current timecounter for
>>>> the TSC timecounter).
>>>>
>>>> Doing this easy case may give us the most bang for the buck, and it is
>>>> also a good first milestone. Once that is in place we can decide what
>>>> the value is in extending it to support harder variations.
>>>>
>>>> One thing we do need to think about is if the shared page should just
>>>> export a fixed set of global data, or if it should export routines. The
>>>> latter approach is more complex, but it makes the ABI boundary between
>>>> userland and the kernel more friendly to future changes. I believe Linux
>>>> does the latter approach?
>>>>
>>>> --
>>>> John Baldwin
>>>>
>>> _______________________________________________
>>> freebsd-hackers@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
