Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 15 Mar 2011 13:31:20 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Jung-uk Kim <jkim@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r219646 - head/sys/x86/isa
Message-ID:  <20110315123819.X920@besplex.bde.org>
In-Reply-To: <201103142205.p2EM5x6E012664@svn.freebsd.org>
References:  <201103142205.p2EM5x6E012664@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 14 Mar 2011, Jung-uk Kim wrote:

> Log:
>  When TSC is unavailable, broken or disabled and the current timecounter has
>  better quality than i8254 timer, use it for DELAY(9).

You cannot use a random timecounter for DELAY().  DELAY() is carefully
written to not use any locks, so that it doesn't deadlock when called
in the "any" context from a bad console driver like syscons when syscons
is called from ddb.  (Bad console drivers may have their own locks which
deadlock in ddb, but that is another problem.)  Even the i8254 timer,
which is the only timer that should be used by DELAY(), normally use locks
that used to cause deadlock, but DELAY() was unbroken to bypass the locks
when it is called from ddb.

Cpufreq and other calibration code should use a normal timecounter via
nanouptime() in all cases and never go near low level timecounters or
DELAY().  There are (hopefully minor) problems getting the nanotime()
initialized before it used.  The dummy timecounter is just what you
don't want to use.  Even in this patch, it isn't clear that the low
level timecounters are initialized before they are used.

> Modified: head/sys/x86/isa/clock.c
> ==============================================================================
> --- head/sys/x86/isa/clock.c	Mon Mar 14 19:31:43 2011	(r219645)
> +++ head/sys/x86/isa/clock.c	Mon Mar 14 22:05:59 2011	(r219646)
> @@ -245,6 +245,42 @@ getit(void)
> 	return ((high << 8) | low);
> }
>
> +static __inline void
> +delay_tsc(int n)
> +{
> +	uint64_t start, end, now;
> +
> +	sched_pin();
> +	start = rdtsc();
> +	end = start + (tsc_freq * n) / 1000000;
> +	do {
> +		cpu_spinwait();
> +		now = rdtsc();
> +	} while (now < end || (now > start && end < start));
> +	sched_unpin();
> +}

You cannot call random scheduling code from DELAY(), since the scheduling
code is not designed to be called in the "any" context.  As it happens,
sched_pin() and sched_unpin() are safe, and were already used in the TSC
case.

> +
> +static __inline void
> +delay_timecounter(struct timecounter *tc, int n)
> +{
> +	uint64_t end, now;
> +	u_int last, mask, u;
> +
> +	mask = tc->tc_counter_mask;
> +	last = tc->tc_get_timecount(tc) & mask;
> +	end = tc->tc_frequency * n / 1000000;

This depends on the delicate timecounter locking to be safe.  I think it
it is safe except possibly in tc_get_timecount(), for the same reasons
that nanouptime() can run safely with no visible locking.  tc must be
the current timecounter/timehands, which is guaranteed to not be in an
in-between state or become so while we are using it, despite there being
no visible locking.  Actually there are some minor races:
- in nanouptime(), the timecounter generation is checked to avoid using
   a new generation if necessary, but updates of the generation count and
   the fileds that it protects are not properly atomic.
- here, the check of the generation is missing.  So this works without
   races when called from ddb (since ddb stops other CPUs), but has minor
   races in general.

> +	now = 0;
> +	do {
> +		cpu_spinwait();
> +		u = tc->tc_get_timecount(tc) & mask;
> +		if (u < last)
> +			now += mask - last + u + 1;
> +		else
> +			now += u - last;
> +		last = u;
> +	} while (now < end);
> +}
> +
> /*
>  * Wait "n" microseconds.
>  * Relies on timer 1 counting down from (i8254_freq / hz)
> @@ -253,6 +289,7 @@ getit(void)
> void
> DELAY(int n)
> {
> +	struct timecounter *tc;
> 	int delta, prev_tick, tick, ticks_left;
>
> #ifdef DELAYDEBUG
> @@ -262,16 +299,12 @@ DELAY(int n)
> #endif
>
> 	if (tsc_freq != 0) {
> -		uint64_t start, end, now;
> -
> -		sched_pin();
> -		start = rdtsc();
> -		end = start + (tsc_freq * n) / 1000000;
> -		do {
> -			cpu_spinwait();
> -			now = rdtsc();
> -		} while (now < end || (now > start && end < start));
> -		sched_unpin();
> +		delay_tsc(n);
> +		return;
> +	}
> +	tc = timecounter;
> +	if (tc->tc_quality > 0) {
> +		delay_timecounter(tc, n);
> 		return;
> 	}

Optimizing DELAY() was bogus.  Now the loop for this is duplicatied.

> #ifdef DELAYDEBUG
>

Stuff keeps accumulating or mutating above the code for testing it.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110315123819.X920>