Date: Fri, 25 Mar 2016 10:49:03 +0200
From: Konstantin Belousov
To: Bruce Evans
Cc: John Baldwin, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org, "'rstone@freebsd.org'"
Subject: Re: svn commit: r297039 - head/sys/x86/x86
Message-ID: <20160325084902.GH1741@kib.kiev.ua>
In-Reply-To: <20160325060901.N2059@besplex.bde.org>

On Fri, Mar 25, 2016 at 07:13:54AM +1100, Bruce Evans wrote:
> On Thu, 24 Mar 2016, Konstantin Belousov wrote:
[Skipped lock adaptive spinning text for now].

>
> >> My systems allow speed variations of about 4000:800 = 5:1 for one CPU and
> >> about 50:1 for different CPUs.  So the old method gave a variation of up
> >> to 50:1.  This can be reduced to only 5:1 using the boot-time calibration.
> > What do you mean by 'for different CPUs' ?  I understand that modern ESS
> > can give us CPU frequency between 800-4200MHz, which is what you mean
> > by 'for one CPU'.  We definitely do not care if 5usec timeout becomes
> > 25usecs, since we practically never time-out there at all.
>
> Yes, I actually get 4400:800 on i4790K.
>
> The ratio is even larger than that with a hard-coded limit because old
> CPUs are much slower than i4790K.  I sometimes run a 367 MHz (P2 class)
> CPU.  It is several times slower than a new CPU at the same clock
> frequency, and any throttling would make it even slower.
>
> 50 times slower means that a reasonable emergency timeout of 60 seconds
> becomes 3000 seconds.  Local users would get tired of waiting and reset,
> and remote users might have to wait.
But you do not downclock a machine booted at the 4.0GHz datasheet clock
down to 367MHz.  For a 400MHz P2 machine, the LAPIC would be calibrated
at that 400MHz rate.

> There is another thread about early DELAY() using the i8254 not working
> to calibrate the TSC.  That might be just because DELAY() is interrupted.
> DELAY() never bothered to disable interrupts.  Its early use for calibrating
> the TSC depends on interrupts mostly not happening then.  (My version is
> a bit more careful, but it still doesn't disable interrupts.  It
> establishes error bounds provided interrupts are shorter than the i8254
> wrap period.)  If the i8254 is virtual, then even disabling interrupts
> on the target wouldn't help, since the disabling would only be virtual.
Yes, the DELAY() calibration is something I wanted to ask about.  Could
you please take a look at https://reviews.freebsd.org/D5738 ?  There is
code there which would benefit from better (re-)calibration.

Below is the patch to implement calibration of the ipi_wait() busy loop.
On my 3.4GHz Sandy Bridge, I get the message
LAPIC: ipi_wait() us multiplier 37 (r 128652089678 tsc 3392383992)

diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c
index 7e5087b..0842de5 100644
--- a/sys/x86/x86/local_apic.c
+++ b/sys/x86/x86/local_apic.c
@@ -56,6 +56,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
+#include
 #include
 #include
 #include
@@ -162,6 +163,7 @@ int x2apic_mode;
 int lapic_eoi_suppression;
 static u_long lapic_timer_divisor;
 static struct eventtimer lapic_et;
+static uint64_t lapic_ipi_wait_mult;
 
 SYSCTL_NODE(_hw, OID_AUTO, apic, CTLFLAG_RD, 0, "APIC options");
 SYSCTL_INT(_hw_apic, OID_AUTO, x2apic_mode, CTLFLAG_RD, &x2apic_mode, 0, "");
@@ -391,6 +393,7 @@ lvt_mode(struct lapic *la, u_int pin, uint32_t value)
 static void
 native_lapic_init(vm_paddr_t addr)
 {
+	uint64_t r;
 	uint32_t ver;
 	u_int regs[4];
 	int i, arat;
@@ -484,6 +487,34 @@ native_lapic_init(vm_paddr_t addr)
 		TUNABLE_INT_FETCH("hw.lapic_eoi_suppression",
 		    &lapic_eoi_suppression);
 	}
+
+#define LOOPS 1000000
+	/*
+	 * Calibrate the busy loop waiting for IPI ack in xAPIC mode.
+	 * lapic_ipi_wait_mult contains the number of iterations which
+	 * approximately delay execution for 1 microsecond (the
+	 * argument to native_lapic_ipi_wait() is in microseconds).
+	 *
+	 * We assume that TSC is present and already measured.
+	 * Possible TSC frequency jumps are irrelevant to the
+	 * calibration loop below, the CPU clock management code is
+	 * not yet started, and we do not enter sleep states.
+	 */
+	KASSERT((cpu_feature & CPUID_TSC) != 0 && tsc_freq != 0,
+	    ("TSC not initialized"));
+	r = rdtsc();
+	for (r = 0; r < LOOPS; r++) {
+		(void)lapic_read_icr_lo();
+		ia32_pause();
+	}
+	r = rdtsc() - r;
+	lapic_ipi_wait_mult = (r * 1000000) / tsc_freq / LOOPS;
+	if (bootverbose) {
+		printf("LAPIC: ipi_wait() us multiplier %jd (r %jd tsc %jd)\n",
+		    (uintmax_t)lapic_ipi_wait_mult, (uintmax_t)r,
+		    (uintmax_t)tsc_freq);
+	}
+#undef LOOPS
 }
 
 /*
@@ -1621,31 +1652,26 @@ SYSINIT(apic_setup_io, SI_SUB_INTR, SI_ORDER_THIRD, apic_setup_io, NULL);
  * private to the MD code.  The public interface for the rest of the
  * kernel is defined in mp_machdep.c.
  */
+
+/*
+ * Wait delay microseconds for IPI to be sent.  If delay is -1, we
+ * wait forever.
+ */
 static int
 native_lapic_ipi_wait(int delay)
 {
-	int x;
+	uint64_t i, counter;
 
 	/* LAPIC_ICR.APIC_DELSTAT_MASK is undefined in x2APIC mode */
-	if (x2apic_mode)
+	if (x2apic_mode || delay == -1)
 		return (1);
 
-	/*
-	 * Wait delay microseconds for IPI to be sent.  If delay is
-	 * -1, we wait forever.
-	 */
-	if (delay == -1) {
-		while ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) !=
-		    APIC_DELSTAT_IDLE)
-			ia32_pause();
-		return (1);
-	}
-
-	for (x = 0; x < delay; x++) {
+	counter = lapic_ipi_wait_mult * delay;
+	for (i = 0; i < counter; i++) {
 		if ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) ==
 		    APIC_DELSTAT_IDLE)
 			return (1);
-		DELAY(1);
+		ia32_pause();
 	}
 	return (0);
 }