Date:      Fri, 18 Apr 2008 12:30:30 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        freebsd-current@freebsd.org
Subject:   Re: TSC Timecounter and multi-core/SMP
Message-ID:  <200804181930.m3IJUUYx026599@apollo.backplane.com>
References:  <51610.1208498408@critter.freebsd.dk> <4808E06D.8020304@elischer.org>

:How hard can it be?
:
:An instruction that gives a 64 bit counter, in some reasonable
:granularity that is run at the same speed for all CPUS in a system
:regardless of the speed each cpu is running..
:While nsecs would be nice even usecs might do.
:They don't even have to be in sync as long as the offset
:between them is constant (though that would be nice).
:Bonus points for being able to read it from user space. The
:hardware people don't seem to realise the importance
:of this, and keep throwing it out to gain/save a pin or to save
:some transistors for some other feature.

    I think it's harder than it sounds.  The technology isn't difficult,
    the problem is the three requirements people seem to have for a solid
    time base these days:

    * Fast access time (in-instruction-stream)
    * High resolution (~1 ns)
    * Doesn't eat up a bunch of die area or current

    What it comes down to, really, is simply the fact that you can't just
    generate an independent time source at a fixed frequency, use it to
    drive a counter, and then latch it into the cpu without synchronizing
    it to the cpu's internal clock.  Latches are highly sensitive to input
    changes that occur simultaneously with the latching clock.  I'd have
    to research the actual gate configuration AMD and Intel use but
    basically you can wind up with either a full-blown latch-up condition,
    where the latch tries to drive both a 1 and a 0 (resulting in a short),
    or you can create an oscillation or other indeterminate state for a
    short while which can propagate onto the cpu's internal busses and
    would be really bad news (or at least result in occasional garbage
    when trying to read the counter).  The very last thing you want to have
    to do is resynchronize 64 bits in parallel, which means the actual
    counter would have to be implemented in the cpu's core logic and be
    synchronized to the cpu's core frequency.
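
    The "occasional garbage" case can be sketched in C.  This is only a
    toy model, not a description of any real AMD or Intel circuit: it
    assumes that every bit which differs between the old and new counter
    value may be seen at either level when sampled mid-transition.  The
    function names (torn_sample, bits_in_flight, possible_readings) are
    made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model an unsynchronized read of a binary counter that is stepping from
 * `before` to `after`.  For each bit, `pick` says which level the reader
 * happens to latch: 1 = new level, 0 = old level.  Bits that do not change
 * read the same either way.
 */
static uint64_t
torn_sample(uint64_t before, uint64_t after, uint64_t pick)
{
    return (before & ~pick) | (after & pick);
}

/* Number of bits that change when the counter steps from before to before+1. */
static int
bits_in_flight(uint64_t before)
{
    uint64_t changing = before ^ (before + 1);
    int n = 0;

    while (changing != 0) {
        n += (int)(changing & 1);
        changing >>= 1;
    }
    return n;
}

/* Number of distinct values a torn read could return for one increment. */
static uint64_t
possible_readings(uint64_t before)
{
    int n = bits_in_flight(before);

    assert(n < 64);     /* 1 << 64 would overflow; all-ones counter excluded */
    return 1ULL << n;
}
```

    In this model a step from 0x7FFFFFFF to 0x80000000 has 32 bits in
    flight, so a torn read can return anything from 0 to 0xFFFFFFFF.
    That is why each bit would need its own resynchronization if the
    counter lived outside the cpu's clock domain.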

    One solution is to place the counter on a bus which is able to
    resynchronize the data flow, such as a hyper-transport bus.
    But of course if you do that your 'RDTSC' equivalent is going to take
    more than a few cycles to run.

    If one didn't mind forgoing the high resolution requirement then
    the problem is greatly simplified... an external time base, such as
    a 1-30 MHz crystal, can be fed into just one bit's worth of
    resynchronization logic to generate counter pulses at the cpu's operating
    frequency and the counter can then be implemented inside the cpu,
    synchronized to its operating frequency.  THAT could be done
    very easily, and at virtually no cost in die area or current.  The
    timer would have to run at no more than 1/2 the frequency of the cpu's
    lowest-frequency operating state, which could be very low indeed.
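
    A rough C simulation of that one-bit scheme, to show where the 1/2
    limit comes from.  It is a sketch under stated assumptions: a
    two-stage synchronizer followed by an edge detector is one common
    way to do this, not necessarily what any cpu vendor uses, and the
    model is purely digital (it samples an ideal square wave and ignores
    metastability and clock jitter).  The function name and parameters
    are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count rising edges of an external square wave (ext_hz) as seen from a
 * counter clocked at cpu_hz, over cpu_ticks cpu clocks.  The external level
 * is sampled once per cpu clock into a two-stage, one-bit synchronizer
 * (s1 -> s2); a rising edge on s2 produces one counter pulse.
 *
 * Note: t * ext_hz must fit in 64 bits, so keep the inputs modest.
 */
static uint64_t
count_synced_edges(uint64_t cpu_hz, uint64_t ext_hz, uint64_t cpu_ticks)
{
    uint64_t count = 0;
    int s1 = 0, s2 = 0, prev = 0;

    assert(cpu_hz > 0 && ext_hz > 0);

    for (uint64_t t = 0; t < cpu_ticks; t++) {
        /* Position inside the current external period, scaled to [0, cpu_hz). */
        uint64_t phase = (t * ext_hz) % cpu_hz;
        /* The external signal is high during the first half of its period. */
        int ext = (2 * phase < cpu_hz);

        /* Shift the sample through the synchronizer stages. */
        prev = s2;
        s2 = s1;
        s1 = ext;

        /* One counter pulse per rising edge, now in the cpu clock domain. */
        if (s2 && !prev)
            count++;
    }
    return count;
}
```

    With cpu_hz = 1000 and ext_hz = 100, 1000 cpu ticks count all 100
    external edges.  At ext_hz = 500 (exactly 1/2) the ideal model still
    counts all 500, but at ext_hz = 900 most edges are never seen because
    the signal changes faster than it is sampled.  Real hardware would
    want some margin below 1/2.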

    It kind of turns into a mess no matter how you twist it, as long as
    the 'fast access time' requirement is left in place.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


