From owner-freebsd-current Fri Oct 20 22:16:52 1995
Return-Path: owner-current
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6) id WAA03853 for current-outgoing; Fri, 20 Oct 1995 22:16:52 -0700
Received: from relay1.UU.NET (relay1.UU.NET [192.48.96.5]) by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id WAA03845 for ; Fri, 20 Oct 1995 22:16:45 -0700
Received: from ast.com by relay1.UU.NET with SMTP id QQzmjd01551; Sat, 21 Oct 1995 01:16:07 -0400 (EDT)
Received: from trsvax.fw.ast.com (fw.ast.com) by ast.com with SMTP id AA01038 (5.67b/IDA-1.5); Fri, 20 Oct 1995 22:17:34 -0700
Received: by trsvax.fw.ast.com (/\=-/\ Smail3.1.18.1 #18.1) id ; Sat, 21 Oct 95 00:10 CDT
Received: by nemesis.lonestar.org (Smail3.1.27.1 #19) id m0t6W9B-000IxkC; Sat, 21 Oct 95 00:07 WET DST
Message-Id:
Date: Sat, 21 Oct 95 00:07 WET DST
To: wollman@lcs.mit.edu, joerg_wunsch@uriah.heep.sax.de, freebsd-current@FreeBSD.org
From: uhclem%nemesis@fw.ast.com (Frank Durda IV)
Sent: Sat Oct 21 1995, 00:07:05 CDT
Subject: Re: clock running faster?
Sender: owner-current@FreeBSD.org
Precedence: bulk

As Bruce Evans wrote:
>The changes have the effect of making the Pentium clock the reference.

On Thu, 19 Oct 1995 23:57:17 +0100 (MET), J Wunsch said:
>I don't think this has been a good idea.

"Garrett A. Wollman" then wrote:
>Please explain your reason for believing this, keeping in mind that a
>similar technique was used on MicroVAXen in 4.3 (except using
>microtime() rather than a purpose-built function).

Yes, and the VAX 11/780 did this too.  But all you had to do was walk
up to the console, halt the KL (or whatever the main processor
designation was), type SET CLOCK FAST, and then let the processor
continue.  Now the 780 is executing instructions about 20% faster than
normal, but guess what?  Despite having that nice TOD hardware down
there, the BSD 4.x wall-time clock is now running fast, gaining a lot
per hour.
(VMS used the TOD hardware, so this didn't happen when running VMS.)

We used to use this extra burst of speed when we really needed some
big system build to get done faster.  Just halt, change speeds and
resume; no need to reboot BSD, just a slight pause.  Of course,
everybody would complain about the uptime and TOD clocks not agreeing
with wall clocks or with each other, so we didn't do it very often.

Granted, the SET CLOCK FAST command mechanism on the VAX was meant to
be used by CEs to detect flaky components that were on the edge of
failing, but guess what?  Modern PC makers "cheat" on the processor
clock too, and don't always advertise the fact.  In fact, in this day
of one-main-logic-board-fits-all, the processor clock is probably
generated by a clock synthesizer chip that may be allowed to be as
much as 5% fast or slow on a given machine AND on a given boot cycle,
since the synth always starts up at a very slow speed (8 to 20MHz).
Crystals usually deviate far less from the marked frequency, and they
hold the same value from one boot to the next, but crystals are
becoming far less common on motherboards.

Further, I can name several major PC vendors who deliberately program
"high" settings into the synth chips so that their machines will
perform better in benchmarks, and the systems are sold with these
faster values.  These values are loaded by the BIOS after POST is
complete.  Since Intel (and other processor makers) have tested their
components clocked at up to 10% faster than the listed value, this 5%
increase isn't a problem, particularly if the cooling of the processor
is under control.  The chip makers don't like it when computer makers
do this, and they will use it as an excuse if the computer
manufacturer runs into reliability trouble, but this rarely happens.
I can even name a couple of Triton-based Intel-built boards out there
in the market right now where this trick is done so that Vendor A can
say their system is faster than Vendor P, even though they both have
the exact same Intel motherboard inside.

But this "bit of extra speed" can really mess up time-keeping that is
based on processor speed, particularly if events skew the routine or
routines that estimate processor speed.  This gets worse if the timing
routine tries to rationalize its results to "known" processor speeds,
such as 33, 50, 66, 75, 90, 100, 120, 133, etc.  If the number comes
up as 92, the code might simply report 90 and do all the clock math
based on 90, which will be wrong.  But there are other things that can
mess up the estimate.

Note that like VMS, DOS and Windows don't do this - they use the
system timers and the CMOS clock for TOD duties.  Now, I will be the
first to admit that the 14.31818MHz clock (4 x the NTSC color burst
frequency - what a choice) that the traditional PC system clock is
derived from (although a multiple of this is used frequently in newer
machines) isn't that easy to work with, but at least it is consistent
on all PCs, and even a 200ppm error in a given crystal will still be a
pretty small drift in the system clock, once you divide it down to
~18.2/sec or ~100/sec or whatever you have it programmed to.  (This
clock has to be reasonably accurate or else RS-232 operations will
suffer from bit errors.)

If a processor "tight loop" with interrupts disabled is used to
determine the processor clock speed (I suspect this is the case), such
a routine can be fooled by the methods some chipsets use for
performing refresh: instead of being spaced at nice even intervals as
was done years ago, the refresh cycles may now come in groups, or be
eliminated based on recent Read/Write accesses.
These so-called "smart" refresh managers will completely fool a timing
routine, producing one set of values on one motherboard brand and a
different set of values on a different motherboard model.  The results
can also vary depending on what memory locations the loop resides in.

Keep in mind that nearly 100% of the DRAM out there simply requires
that each of the 256 rows be refreshed once every 16msec.  There is
nothing that says the refresh hardware must take that 16msec, divide
it by 256, and refresh one row at even intervals.  All 256 rows can be
done at once, or at irregular intervals.  Since a refresh cycle is
simply an incomplete memory read cycle (RAS but no CAS), the acts of
fetching instructions and reading and writing memory all effectively
perform refreshes, but not in a uniform fashion.  Old systems made no
attempt to take advantage of the fact that the processor had accessed
given memory rows recently, but many bus control chipsets made since
1990 do, keeping "decay" counters that alert them to when a row needs
refreshing; these counters are reset when the processor accesses that
row as part of one of its memory operations, a DMA cycle, or a genuine
refresh cycle.

This somewhat unpredictable refresh activity could probably explain
systems where people boot ten times and get ten different processor
speed ratings, for code that should be running "uninterrupted".  This
type of computation error could be reduced by greatly increasing the
duration of the timing test, but that may not be desirable.

Even if you argue that the complete timing test should end up being
executed from the processor cache and pipeline, since the full
internal workings of the Pentium (or its Pro buddy) are not public, we
can't be sure that residual memory writes from earlier operations
aren't performed during the actual timing test, or that memory reads
for pipeline and cache fills aren't occurring during the test.
There isn't any good way to control this without full internal
knowledge of how and when the processor tries to access memory, and of
what impact DMA and refresh bus grants have on the processor when it
is running code already inside the processor.  And of course, not
everybody has Pentiums, and no one except Intel knows if all versions
of the Pentium behave the same in this area.

Then you have the situation where the CPU clock is not symmetrical
and/or not an even multiple of the ISA bus clock, or of the clock that
is used within the ISA chipset for things like timers, interrupt
controllers, etc.  This difference in timing can cause variable-length
wait states to appear when attempting to access hardware registers
that are not actually running at the full 66, 90 or whatever MHz speed
the CPU is running at.  These speed differences can also induce
variable latencies in reporting interrupts to the CPU, particularly in
systems that serialize IRQs to save on pins on the package.

Bottom line, I also agree that estimating CPU speed and using that as
the TOD clock is a bad idea.  There are just too many variables to
make this work well on all of the systems out there.

(Ask before SGMLing)

Frank Durda IV                              |"The Knights who say "LETNi"
or uhclem%nemesis@fw.ast.com (Fastest Route)| demand... A SEGMENT REGISTER!!!"
...letni!rwsys!nemesis!uhclem               |"A what?"
...decvax!fw.ast.com!nemesis!uhclem         |"LETNi! LETNi! LETNi!" - 1983

Microsoft, you can add this to your Developer Network CD in exchange
for a three-year Level 2 license.