Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 23 Mar 2016 19:15:12 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r297039 - head/sys/x86/x86
Message-ID:  <20160323171512.GA1741@kib.kiev.ua>
In-Reply-To: <20160323075842.GX1741@kib.kiev.ua>
References:  <201603181948.u2IJmndg063765@repo.freebsd.org> <20160319032216.GZ1741@kib.kiev.ua> <1866602.Bp7VFd5f42@ralph.baldwin.cx> <20160323075842.GX1741@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Mar 23, 2016 at 09:58:42AM +0200, Konstantin Belousov wrote:
> If TSC has the behaviour you described, i.e. suddenly jumping random
> steps on single CPU, from the point of view of kernel, then the system
> is seriosly misbehaving.  The timekeeping stuff would be badly broken
> regardless of the ipi_wait().  I do not see why should we worry about
> that in ipi_wait().
> 
> I proposed slightly different thing, i.e. using timekeep code to indirect
> to TSC if configured so.  Below is the proof of concept patch, use of
> nanouptime() may be too naive, and binuptime() would cause tiny bit less
> overhead, but I do not want to think about arithmetic.
> 
> diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c
> index 7e5087b..fdc12a4 100644
> --- a/sys/x86/x86/local_apic.c
> +++ b/sys/x86/x86/local_apic.c
> @@ -1621,31 +1621,37 @@ SYSINIT(apic_setup_io, SI_SUB_INTR, SI_ORDER_THIRD, apic_setup_io, NULL);
>   * private to the MD code.  The public interface for the rest of the
>   * kernel is defined in mp_machdep.c.
>   */
> +
> +/*
> + * Wait delay microseconds for IPI to be sent.  If delay is -1, we
> + * wait forever.
> + */
>  static int
>  native_lapic_ipi_wait(int delay)
>  {
> -	int x;
> +	struct timespec end, x;
>  
>  	/* LAPIC_ICR.APIC_DELSTAT_MASK is undefined in x2APIC mode */
>  	if (x2apic_mode)
>  		return (1);
>  
> -	/*
> -	 * Wait delay microseconds for IPI to be sent.  If delay is
> -	 * -1, we wait forever.
> -	 */
> -	if (delay == -1) {
> -		while ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) !=
> -		    APIC_DELSTAT_IDLE)
> -			ia32_pause();
> -		return (1);
> +	if (delay != -1) {
> +		KASSERT(delay > 0, ("wrong delay %d", delay));
> +		x.tv_sec = delay / 1000000;
> +		x.tv_nsec = (delay % 1000000) * 1000;
> +		nanouptime(&end);
> +		timespecadd(&end, &x);
>  	}
> -
> -	for (x = 0; x < delay; x++) {
> +	for (;;) {
>  		if ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) ==
>  		    APIC_DELSTAT_IDLE)
>  			return (1);
> -		DELAY(1);
> +		ia32_pause();
> +		if (delay != -1) {
> +			nanouptime(&x);
> +			if (timespeccmp(&x, &end, >))
> +				break;
> +		}
>  	}
>  	return (0);
>  }

Of course this cannot work if timecounter takes any locks, even spinlock.
Only TSC or similar lock-less mechanisms are allowed there. Meh.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160323171512.GA1741>