Date:      Tue, 28 Apr 2009 00:24:25 +0200
From:      Juergen Lock <nox@jelal.kn-bremen.de>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        kalinoj1@iem.pw.edu.pl, freebsd-emulation@FreeBSD.org
Subject:   Re: Recent qemu and timers issue
Message-ID:  <20090427222425.GA32342@triton.kn-bremen.de>
In-Reply-To: <20090427182336.K64097@delplex.bde.org>
References:  <200904032223.n33MNTiq019599@triton.kn-bremen.de> <200904072137.n37LbbdC071227@triton.kn-bremen.de> <20090423214701.GA83621@triton.kn-bremen.de> <20090424201623.N887@besplex.bde.org> <20090426184021.GA9545@triton.kn-bremen.de> <20090427182336.K64097@delplex.bde.org>

On Mon, Apr 27, 2009 at 07:11:50PM +1000, Bruce Evans wrote:
> On Sun, 26 Apr 2009, Juergen Lock wrote:
> 
> > On Fri, Apr 24, 2009 at 10:20:33PM +1000, Bruce Evans wrote:
> >> On Thu, 23 Apr 2009, Juergen Lock wrote:
> >>
> >>> On Tue, Apr 07, 2009 at 11:37:37PM +0200, Juergen Lock wrote:
> >>>> In article <200904062254.37824.kalinoj1@iem.pw.edu.pl> you write:
> >>>>> On Saturday, 04 April 2009 at 00:23:29, Juergen Lock wrote:
> >>>>>> In article <c948bb4de85d1b2a340ac63a7c46f6d9@iem.pw.edu.pl> you write:
> >>>>> ...
> >>>>>>> I tried to use all possible timers using sysctl, where I have:
> >>>>>>> TSC(800) HPET(900) ACPI-safe(850) i8254(0) dummy(-1000000)
> >>>>>>> None of these helped.
> >>
> >> None of these are normally used for calculating runtimes.  Normally
> >> on i386, the TSC is used.
> >
> > Aaah-haa, this I didn't know.
> >
> >>  The only way to configure this is to edit
> >> the source code.  Try removing the calls to set_cputicker() in the MD
> >> code.  Then the MI and timecounter-based cputicker tc_cpu_ticks() will
> >> be used.
> >
> > Yup, that seemed to help indeed. (patch below.)
> >
> >>  A better implementation would use a user-selectable
> >> timecounter-based cputicker in all cases, but usually not the system
> >> timecounter since that is likely to be very slow so as to be more
> >> accurate.
> >>
> > This was using qemu's emulated hpet...  I guess you mean slow to read
> > the counter value?  How often is the cputicker read, at every context
> > switch?  More often?
> 
> Yes, ACPI timecounter hardware typically takes 1000 nsec to read, while
> TSC hardware typically takes 5 nsec to read (12 cycles on AthlonXP and
> Athlon64; more on P3-4, Core2 and Phenom).  I don't know how long it
> takes to read a typical HPET.  Emulated timecounter hardware is likely to
> be even slower.

Yup.  If I'm not mistaken, a read of the emulated timer is handled like
any other memory-mapped I/O access in qemu: the kqemu kernel code has to
return to userland qemu to handle it, and the result is then passed back
to the guest running in kqemu...
(Well, with -kernel-kqemu anyway; with regular userland kqemu mode the
guest kernel code that does the I/O still runs in JIT mode, i.e. simulation.)

>  Timecounter software typically adds only another 20
> (50?) nsec.  The cputicker is read mainly at every context switch.
> 
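To put rough numbers on the read costs quoted above, here is a minimal
sketch.  It assumes one cputicker read per context switch and a made-up
load of 10000 context switches per second; both are assumptions for
illustration, not measurements.  The 1000 nsec and 5 nsec figures are the
ones from Bruce's mail, and ticker_overhead() is a hypothetical helper.

```python
# Rough cost of reading the cputicker once per context switch.
# SWITCHES_PER_SEC is a hypothetical load, not a measured value.
SWITCHES_PER_SEC = 10_000


def ticker_overhead(read_cost_nsec, switches_per_sec=SWITCHES_PER_SEC):
    """Return the fraction of each second spent reading the ticker."""
    return read_cost_nsec * switches_per_sec / 1e9


acpi = ticker_overhead(1000)   # ACPI timecounter, about 1000 nsec per read
tsc = ticker_overhead(5)       # TSC, about 5 nsec per read
print(f"ACPI: {acpi:.4%} of CPU time, TSC: {tsc:.4%} of CPU time")
```

Under those assumptions the ACPI case costs about 1% of CPU time and the
TSC case about 0.005%, before adding any emulation overhead.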
> >> [...some fixes]
> >>
> > ...and I tried this, both changes didn't fix the problem.
> >
> >> Another thing you can try here is to edit the source code to change
> >> the set_cputicker() calls to say that the frequency is not variable.
> >
> > That probably won't help here because I noticed that at least the
> > initial tsc `calibration' in the guest (in init_TSC()) is way off too
> > (it came out at less than half of the actual frequency here, and dmesg
> > on this host says `TSC: P-state invariant'.)
> 
> The initial calibration code is even sloppier than the recalibration,
> and is more likely not to work under emulation.  It depends on the
> i8254 timer being accurate and doesn't try to sandwich reads of the
> TSC between close-together reads of the reference timer or otherwise
> try to limit errors in reading the reference timer.  With real hardware
> this normally causes an avoidable error of at most 5 ppm (from waiting
> 5 i8254 cycles extra), but with emulated hardware it probably causes
> a larger error even if the emulation is perfect.  The recalibration
> does better by using a higher quality reference timer sampled over an
> interval 16 times as long.
> 
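The "sandwich" idea mentioned above could look roughly like the sketch
below.  This is not the kernel's code (the point of the quoted paragraph
is that the initial calibration does not do this); it only illustrates
the technique against a simulated machine.  FakeMachine, max_gap, the
injected 50 msec stall and the 2.4 GHz TSC are all hypothetical; 1193182 Hz
is the nominal i8254 frequency.

```python
def sandwiched_sample(read_ref, read_tsc, max_gap, max_tries=100):
    """Read the TSC between two reads of the reference timer.

    Retry until the two reference reads are at most max_gap counts apart,
    so the TSC value is pinned to a narrow window of reference time.
    """
    for _ in range(max_tries):
        before = read_ref()
        tsc = read_tsc()
        after = read_ref()
        if after - before <= max_gap:
            return (before + after) / 2, tsc
    raise RuntimeError("reference timer reads never got close enough")


def calibrate(read_ref, read_tsc, wait, ref_hz, max_gap):
    """Estimate the TSC frequency in Hz from two sandwiched samples."""
    ref0, tsc0 = sandwiched_sample(read_ref, read_tsc, max_gap)
    wait()
    ref1, tsc1 = sandwiched_sample(read_ref, read_tsc, max_gap)
    return (tsc1 - tsc0) * ref_hz / (ref1 - ref0)


class FakeMachine:
    """Simulated hardware (hypothetical): reference timer plus a TSC."""
    REF_HZ = 1_193_182          # nominal i8254 frequency
    TSC_HZ = 2_400_000_000      # made-up TSC frequency

    def __init__(self, delays):
        self.now = 0.0              # simulated time in seconds
        self.delays = list(delays)  # stalls injected into reference reads

    def read_ref(self):
        # Each reference read costs 1 usec unless a stall is queued.
        self.now += self.delays.pop(0) if self.delays else 1e-6
        return int(self.now * self.REF_HZ)

    def read_tsc(self):
        return int(self.now * self.TSC_HZ)

    def wait(self):
        self.now += 1.0             # calibration interval of one second


# The second reference read stalls for 50 msec, as an emulator might.
machine = FakeMachine([1e-6, 0.05])
hz = calibrate(machine.read_ref, machine.read_tsc, machine.wait,
               FakeMachine.REF_HZ, max_gap=5)
print(f"estimated TSC frequency: {hz:.0f} Hz")
```

The stalled sample is thrown away and retried, so the stall does not end
up in the frequency estimate.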
 Yup, the recalibrated value is much closer to correct, as can be seen by
printing cpu_tick_frequency in kgdb.  (And this recalibration seems to
have caused at least one of the `runtime went backwards' messages: it
appeared once the measurement did fit in the 16 +/- 1/256 seconds range of
cpu_tick_calibrate(), i.e. when it actually updated cpu_tick_frequency,
although I did get another one of those messages later on.)
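
A small worked example of how a frequency update alone could make a
runtime appear to go backwards.  It assumes runtimes are derived as
accumulated ticks divided by the current tick frequency; ticks_to_usec()
and all three numbers are hypothetical, chosen only so that the boot-time
value is less than half of the recalibrated one, as observed here.

```python
def ticks_to_usec(ticks, tick_hz):
    """Convert an accumulated tick count to microseconds at a given rate."""
    return ticks * 1_000_000 // tick_hz


ticks = 3_000_000_000        # ticks a thread has accumulated so far
boot_hz = 1_000_000_000      # hypothetical under-estimate from boot
recal_hz = 2_400_000_000     # hypothetical value after recalibration

before = ticks_to_usec(ticks, boot_hz)     # 3000000 usec
after = ticks_to_usec(ticks, recal_hz)     # 1250000 usec
print(before, after, after < before)
```

The tick count never decreased, but the same ticks convert to a smaller
runtime once the frequency estimate jumps upward.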

> This should be fixable using the machdep.tsc_freq sysctl.  However,
> this sysctl neglects to call set_cputicker().  This should make
> little difference when the frequency is nominated as variable since
> recalibration should change it soon anyway.  However, the bug in
> recalibration prevents downwards adjustments.
> 
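A toy model of the "no downwards adjustments" behaviour described above,
assuming it acts as an upward-only ratchet on the frequency estimate.
TickRate and the numbers are hypothetical; this is not taken from the
kernel source.

```python
class TickRate:
    """Toy model of a recalibration that only ever adjusts upward."""

    def __init__(self, hz):
        self.hz = hz

    def recalibrate(self, delta_ticks, delta_sec):
        """Measure ticks per second and keep the result only if higher."""
        measured = delta_ticks / delta_sec
        if measured > self.hz:      # downward corrections are dropped
            self.hz = measured
        return self.hz


rate = TickRate(1_000_000_000)
print(rate.recalibrate(38_400_000_000, 16))   # measured higher: accepted
print(rate.recalibrate(16_000_000_000, 16))   # measured lower: ignored
```

With such a ratchet a single over-estimate sticks for good, which is why
setting the frequency by hand only helps if nothing raises it again.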
> > OK, _maybe_ if we somehow get the proper frequency into the guest from
> > the beginning and then say it's not variable, it could work,
> > but that still leaves the case of hosts with a non-P-state-invariant tsc
> > because...
> >
> >> I used this temporarily to work around the non-decreasing calibration.
> >> This should be the default for emulators for most cputickers -- emulators
> >> should emulate a constant frequency and not emulate the complexities
> >> of power saving.
> >
> > Hmm, I guess that's more easily said than done. :)  At least qemu
> > basically just passes the host tsc through when a guest reads it.
> 
> But it claims P-state invariance?  Maybe it gets that from the host.

 No, that message was from the host, the guest didn't say that.

> Does it trap TSC reads?

 No.  (Well, only without kqemu, i.e. when it's doing JIT, and then it
still uses the host tsc, although it may add an offset.)

>  This would be slow, but required to emulate
> P-state invariance and might be required for accurate timing anyway.
> I think emulators shouldn't trap reads of the TSC because the TSC
> is unreliable for accurate timing anyway, but they should do something
> to keep slower-to-access accurate hardware timers virtually accurate.
> Hopefully the hardware people will eventually make a timer like the
> TSC both accurate and fast.  Emulators will have a difficult time
> preserving both.

 Agreed.

 Cheers,
	Juergen


