From owner-svn-src-head@freebsd.org Wed Mar 23 07:58:48 2016 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6A34DAD8D34; Wed, 23 Mar 2016 07:58:48 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1369113A1; Wed, 23 Mar 2016 07:58:47 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u2N7wgae064876 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 23 Mar 2016 09:58:43 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u2N7wgae064876 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u2N7wgLf064875; Wed, 23 Mar 2016 09:58:42 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 23 Mar 2016 09:58:42 +0200 From: Konstantin Belousov To: John Baldwin Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r297039 - head/sys/x86/x86 Message-ID: <20160323075842.GX1741@kib.kiev.ua> References: <201603181948.u2IJmndg063765@repo.freebsd.org> <20160319032216.GZ1741@kib.kiev.ua> <1866602.Bp7VFd5f42@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1866602.Bp7VFd5f42@ralph.baldwin.cx> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Mar 2016 07:58:48 -0000 On Mon, Mar 21, 2016 at 11:12:57AM -0700, John Baldwin wrote: > On Saturday, March 19, 2016 05:22:16 AM Konstantin Belousov wrote: > > On Fri, Mar 18, 2016 at 07:48:49PM +0000, John Baldwin wrote: > > > > > > - for (x = 0; x < delay; x += 5) { > > > + for (x = 0; x < delay; x++) { > > > if ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) == > > > APIC_DELSTAT_IDLE) > > > return (1); > > > - DELAY(5); > > > + DELAY(1); > > > } > > > return (0); > > > } > > > > Ideally we would structure the loop differently. I think it is more > > efficient WRT latency to only block execution by ia32_pause(), and > > compare the the getbinuptime() results to calculate spent time, on each > > loop step. > > Yes. I've thought about using the TSC directly to do that, but folks are > worried about the TSC being unstable due to vcpus in a VM migrating > across physical CPUs. DELAY() does seem to DTRT in that case assuming the > hypervisor doesn't advertise an invariant TSC via cpuid. We'd have to > essentially duplicate DELAY() (really delay_tc()) inline. If TSC has the behaviour you described, i.e. suddenly jumping random steps on single CPU, from the point of view of kernel, then the system is seriosly misbehaving. The timekeeping stuff would be badly broken regardless of the ipi_wait(). I do not see why should we worry about that in ipi_wait(). I proposed slightly different thing, i.e. using timekeep code to indirect to TSC if configured so. Below is the proof of concept patch, use of nanouptime() may be too naive, and binuptime() would cause tiny bit less overhead, but I do not want to think about arithmetic. diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c index 7e5087b..fdc12a4 100644 --- a/sys/x86/x86/local_apic.c +++ b/sys/x86/x86/local_apic.c @@ -1621,31 +1621,37 @@ SYSINIT(apic_setup_io, SI_SUB_INTR, SI_ORDER_THIRD, apic_setup_io, NULL); * private to the MD code. The public interface for the rest of the * kernel is defined in mp_machdep.c. */ + +/* + * Wait delay microseconds for IPI to be sent. If delay is -1, we + * wait forever. + */ static int native_lapic_ipi_wait(int delay) { - int x; + struct timespec end, x; /* LAPIC_ICR.APIC_DELSTAT_MASK is undefined in x2APIC mode */ if (x2apic_mode) return (1); - /* - * Wait delay microseconds for IPI to be sent. If delay is - * -1, we wait forever. - */ - if (delay == -1) { - while ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) != - APIC_DELSTAT_IDLE) - ia32_pause(); - return (1); + if (delay != -1) { + KASSERT(delay > 0, ("wrong delay %d", delay)); + x.tv_sec = delay / 1000000; + x.tv_nsec = (delay % 1000000) * 1000; + nanouptime(&end); + timespecadd(&end, &x); } - - for (x = 0; x < delay; x++) { + for (;;) { if ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) == APIC_DELSTAT_IDLE) return (1); - DELAY(1); + ia32_pause(); + if (delay != -1) { + nanouptime(&x); + if (timespeccmp(&x, &end, >)) + break; + } } return (0); }