Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 13 May 2011 15:47:49 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Jung-uk Kim <jkim@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r221703 - in head/sys: amd64/include i386/include x86/isa x86/x86
Message-ID:  <4DCD2875.9040808@FreeBSD.org>
In-Reply-To: <201105121239.31340.jkim@FreeBSD.org>
References:  <201105091734.p49HY0P3006180@svn.freebsd.org> <4DCBC316.5030209@FreeBSD.org> <201105121239.31340.jkim@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
on 12/05/2011 19:39 Jung-uk Kim said the following:
> On Thursday 12 May 2011 07:23 am, Andriy Gapon wrote:
>> on 09/05/2011 20:34 Jung-uk Kim said the following:
>>> Author: jkim
>>> Date: Mon May  9 17:34:00 2011
>>> New Revision: 221703
>>> URL: http://svn.freebsd.org/changeset/base/221703
>>
>> [snip]
>>
>> I would to note [again] that I don't like code style of this
>> change.
>>
>>> Modified: head/sys/x86/x86/tsc.c
>>> =================================================================
>>> ============= --- head/sys/x86/x86/tsc.c	Mon May  9 17:30:25
>>> 2011	(r221702) +++ head/sys/x86/x86/tsc.c	Mon May  9 17:34:00
>>> 2011	(r221703) @@ -326,7 +326,73 @@ init_TSC(void)
>>>  	    tsc_levels_changed, NULL, EVENTHANDLER_PRI_ANY);
>>>  }
>>>
>>> -void
>>> +#ifdef SMP
>>> +
>>> +#define	TSC_READ(x)			\
>>> +static void				\
>>> +tsc_read_##x(void *arg)			\
>>> +{					\
>>> +	uint32_t *tsc = arg;		\
>>> +	u_int cpu = PCPU_GET(cpuid);	\
>>> +					\
>>> +	tsc[cpu * 3 + x] = rdtsc32();	\
>>> +}
>>> +TSC_READ(0)
>>> +TSC_READ(1)
>>> +TSC_READ(2)
>>> +#undef TSC_READ
>>> +
>>> +#define	N	1000
>>
>> I don't like macro overuse in the above.
>> Especially "N".
> 
> I think duplicating trivial and almost identical functions is more 
> harmful to many eyes.
> 
> 'N' here is temporary and arbitrary.  I think 1,000 is actually 
> overkill but I wanted to "stress-test" smp_rendezvous(). :-P

I guess I need to try to write this in "my way" and see if I would be able to come
up with anything more appealing :-)

> Now that you found a bug in the function, I can remove N but I want to 
> keep it for a while.  Actually, it was already proven useful, i.e., I 
> gave a user the following patch:
> 
> -#define	N	1000
> +#define	N	100

Well, I think this could be a static const variable or a tunable, but not a macro.

> and it worked around his problem for now (until you commit the 
> smp_rendezvous() fix).

I need more testers/reviewers for this.

>>> +static void
>>> +comp_smp_tsc(void *arg)
>>> +{
>>> +	uint32_t *tsc;
>>> +	int32_t d1, d2;
>>> +	u_int cpu = PCPU_GET(cpuid);
>>> +	u_int i, j, size;
>>> +
>>> +	size = (mp_maxid + 1) * 3;
>>> +	for (i = 0, tsc = arg; i < N; i++, tsc += size)
>>> +		CPU_FOREACH(j) {
>>> +			if (j == cpu)
>>> +				continue;
>>> +			d1 = tsc[cpu * 3 + 1] - tsc[j * 3];
>>> +			d2 = tsc[cpu * 3 + 2] - tsc[j * 3 + 1];
>>> +			if (d1 <= 0 || d2 <= 0) {
>>> +				smp_tsc = 0;
>>> +				return;
>>> +			}
>>> +		}
>>> +}
>>> +
>>> +static int
>>> +test_smp_tsc(void)
>>> +{
>>> +	uint32_t *data, *tsc;
>>> +	u_int i, size;
>>> +
>>> +	if (!smp_tsc && !tsc_is_invariant)
>>> +		return (-100);
>>> +	size = (mp_maxid + 1) * 3;
>>> +	data = malloc(sizeof(*data) * size * N, M_TEMP, M_WAITOK);
>>> +	for (i = 0, tsc = data; i < N; i++, tsc += size)
>>> +		smp_rendezvous(tsc_read_0, tsc_read_1, tsc_read_2, tsc);
>>
>> I don't like that what is logically a two dimensional array 3 x
>> (mp_maxid + 1) is represented as a one-dimensional array with all
>> ensuing multiplications by three and other array index
>> manipulations.
> 
> (As I said earlier) I agree with you in general but using 
> two-dimensional array or array of pointers just added more complexity 
> to the code and it didn't make the code any clearer than I originally 
> thought.  Again, it is just a matter of taste, anyway. ;-)

I guess this is the same as above, until I'll try to actually do something I won't
be sure whether this could be done any better (for my taste) :-)

>>> +	smp_tsc = 1;	/* XXX */
>>> +	smp_rendezvous(smp_no_rendevous_barrier, comp_smp_tsc,
>>> +	    smp_no_rendevous_barrier, data);
>>> +	free(data, M_TEMP);
>>> +	if (bootverbose)
>>> +		printf("SMP: %sed TSC synchronization test\n",
>>> +		    smp_tsc ? "pass" : "fail");
>>> +	return (smp_tsc ? 800 : -100);
>>
>> I still think something higher should be returned here for the
>> smp_tsc == true case.  It doesn't make sense to go through the
>> shenanigans to underrate TSC in the end.
> 
> http://docs.freebsd.org/cgi/mid.cgi?201105091352.49971.jkim
> 
> I thought you wanted a separate commit, didn't you?

I wanted HPET and ACPI quality to be changed in a separate commit, but I wanted
TSC quality (for good TSC) to be bumped in this commit.

>> On a more general note, smp_rendezvous() might not be the best
>> mechanism here. In my opinion/understanding, smp_rendezvous()
>> provides only the following guarantees:
>> - if a setup action is specified, then every CPU executes the setup
>> action before any CPU executes the main action
>> - if a teardown action is specified, then every CPU executes the
>> main action before any CPU executes the teardown action
>>
>> There are no timing guarantees, only the sequence guarantees.
> 
> That's all I wanted to test, i.e., we only validate whether all TSCs 
> are in order (and that's all we can do, in fact).

Well, not sure about that.
E.g. the OpenSolaris code and the code that I derived from it compare difference
between measured TSC value with how long it takes for a memory write by one CPU to
be noticed by other CPU.  Not without some heuristic and assumptions, of course.

>> Any timing observations that we may have now are a product of the
>> implementation and can change if the implementation change.  So the
>> newly introduced check may miss systemic differences in TSC values
>> between CPUs if they are small enough. But I am not really sure if
>> such a small differences would really matter.
>>
>> Worse case if there is some difference in TSC frequency between
>> CPUs (e.g. in different sockets), after a powerup or a reset
>> difference between TSC values may be very small, but could grow
>> more and more with uptime.  Not sure if this is a realistic enough
>> scenario, though.
> 
> Actually, I am kinda reluctant to enable smp_tsc by default on recent 
> CPUs.  Although they made all TSCs in sync, it is very very tricky to 
> make it work in reality, e.g.,
> 
> https://patchwork.kernel.org/patch/691712/

I am not sure what is their concern there.
TSC is good to be used as timecounter.  If they use TSC for performance
measurements, then of course they have to use some barriers - this is well known
and documented.  BTW, newer CPUs provide RDTSCP instruction which could be more
convenient there.


-- 
Andriy Gapon



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4DCD2875.9040808>