Date: Sun, 5 Mar 2017 09:13:05 +1100 (EST)
From: Bruce Evans
To: John Baldwin
cc: Pedro Giffuni, Slawa Olhovchenkov, src-committers@freebsd.org,
    svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r314669 - head/sys/i386/conf
In-Reply-To: <2368011.hGEX4V32U5@ralph.baldwin.cx>
Message-ID: <20170305075947.K914@besplex.bde.org>

On Sat, 4 Mar 2017, John Baldwin wrote:

> On Saturday, March 04, 2017 10:52:46 AM Pedro Giffuni wrote:
>>
>> On 03/04/17 10:32, Slawa Olhovchenkov wrote:
>>> On Sat, Mar 04, 2017 at 03:04:17PM +0000, Pedro F. Giffuni wrote:
>>>
>>>> Author: pfg
>>>> Date: Sat Mar 4 15:04:17 2017
>>>> New Revision: 314669
>>>> URL: https://svnweb.freebsd.org/changeset/base/314669
>>>>
>>>> Log:
>>>> Drop i486 from the default i386 GENERIC kernel configuration.
>>>>
>>>> 80486 production was stopped by Intel on September 2007. Dropping the 486
>>>> configuration option from the GENERIC kernel improves performance
>>>> slightly.
>>>>
>>>> Removing I486_CPU is consistent at this time: we don't support any
>>>> processor without a FPU and the PC-98 arch, which frequently involved i486
>>>> CPUs, is also gone so we don't test such platforms anymore.
>>>
>>> What is realy mean? It means that GENERIC is less generic.
>> This means we don't do work-arounds that would be required for raw 486.
>> Instead we will use the 586 instructions by default.
>
> This doesn't change that. The kernel already has runtime tests in place
> for new things on 486 and later via cpuid.

I486_CPU also used to control optimization of get_cyclecount(), by
removing the dynamic test for a TSC if I486_CPU and some other options
are _not_ configured. Now the optimization is never done, but a larger
pessimization is always done:

get_cyclecount():
- amd64 and old i386 without I486_CPU and some others: return inlined
  rdtsc().
- old i386 with I486_CPU or some others: test inline for a TSC; then if
  not an old CPU, return inline rdtsc(); else call binuptime() and do
  bad swizzling
- current i386: always call a function through a function pointer; if
  not an old CPU, this points to non-inline rdtsc(); otherwise, usually
  call binuptime() and do differently bad swizzling to partially support
  abuse of get_cyclecount() as a monotonic timestamp.

get_cyclecount() was changed to use cpu_ticks() in 2011 (r220347).
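
The three shapes listed above can be sketched in userland C. This is only
an illustration under loud assumptions, not the kernel's code: fake_tsc
stands in for the hardware counter, have_tsc for the cpuid feature bit,
fallback_clock() for the binuptime() path, and every function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userland sketch only.  Every name here is a hypothetical stand-in, not a
 * kernel symbol: fake_tsc plays the hardware counter, have_tsc plays the
 * cpuid feature bit, and fallback_clock() plays the binuptime() path.
 */
static uint64_t fake_tsc;
static uint64_t fake_uptime;
static int have_tsc = 1;

/* Stand-in for rdtsc(): a cheap, inlinable counter read. */
static inline uint64_t
read_counter(void)
{
	return (++fake_tsc);
}

/* Stand-in for the slow path taken on CPUs without a TSC. */
static uint64_t
fallback_clock(void)
{
	fake_uptime += 1000;
	return (fake_uptime);
}

/* Shape 1: no old-CPU support configured, so no test at all. */
static inline uint64_t
cyclecount_direct(void)
{
	return (read_counter());
}

/* Shape 2: old-CPU support configured; test inline, then read inline. */
static inline uint64_t
cyclecount_tested(void)
{
	return (have_tsc ? read_counter() : fallback_clock());
}

/* Shape 3: every call is indirect, even when a TSC is present. */
static uint64_t
read_counter_noinline(void)
{
	return (read_counter());
}

static uint64_t (*ticker)(void);

/* Chosen once (for example at boot) from the feature flag. */
static void
select_ticker(void)
{
	ticker = have_tsc ? read_counter_noinline : fallback_clock;
}

static uint64_t
cyclecount_indirect(void)
{
	assert(ticker != NULL);
	return (ticker());
}
```

In this sketch, shape 1 costs only the counter read, shape 2 adds one
branch on a flag, and shape 3 adds an indirect call that the compiler
cannot inline at the call site.
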
cddl has over-engineered direct use of the TSC for timestamps because
native APIs are under-engineered, but important native places like ktr
still abuse get_cyclecount().

get_cyclecount() is mis-engineered on most arches:
- arm: on newer CPUs with pmu, it uses large inline code with many
  function calls that might be inlined; on newer CPUs without pmu, it
  uses 1 function call that might be inlined; on older CPUs, it calls
  binuptime() and does differently bad swizzling again
- arm64: like amd64 (?)
- mips: like amd64 (?), except get_cyclecount() is a macro instead of an
  inline function. The register for this is apparently not good enough
  for the cpu ticker and thus not good enough for abusing for monotonic
  timestamps. The register is apparently only 32 bits. The cpu ticker
  function counts wrap-arounds to maintain 32 more top bits.
- powerpc: like amd64, except the 64-bit result is assembled from 2
  32-bit registers. This looks like it has races on rollover, and
  correct code would look like mips: return only 32 bits in
  get_cyclecount(), and try to avoid rollover in the cpu ticker.
  powerpc does this backwards: it returns 64 bits with races in
  get_cyclecount(), but returns only 32 bits without races for the cpu
  ticker. The races are not too bad because using get_cyclecount() as
  a monotonic timestamp is abuse. Differently bad swizzling breaks
  monotonicity differently badly too.
- riscv: stub that always returns 1. That is really differently bad.
  This should at least copy the generic i386 implementation. riscv
  doesn't set a special cpu ticker either. It gets the default one,
  which is a timecounter, just like on i386 without a TSC. This might
  be a stub too, but then more things would break so it would get fixed.
- sparc64: like amd64 (?).

I486_CPU also used to give an optimized bzero() for i486. That cost a
branch and/or call through a function pointer for newer CPUs. This was
removed in 2010 (r209460).

>>> Some Via CPU is like i486 (by instruction set).
>>>
>>> CPU: VIA Ezra (800.04-MHz 686-class CPU)
>>> Origin="CentaurHauls" Id=0x678 Family=0x6 Model=0x7 Stepping=8
>>> Features=0x803035
>>> AMD Features=0x80000000<3DNow!>
>>>
>>
>> 486 never had MMX extensions.
>> This is a 686, performance should improve ~4%.
>
> How did you measure the improvement? Keeping I486_CPU doesn't really
> do anything except remove a some #ifdef'd conditionals in identcpu.c
> and initcpu.c. It doesn't affect whether we use the TSC, MMX, etc. Those
> are all runtime checks based the CPU feature flags from cpuid.

The old optimization might be worth as much as 0.004%.

I think all the removal does on a plain 486 is move a runtime test so that
486 CPUs no longer pass it, causing a panic for an "unsupported" CPU that
would work except for the panic, and prevent reaching a single 486
optimization in bcopy(). The special support in initcpu() seems to be
only to work around bugs in incompatible 486 variants.

Fixes in this area should go the other way and remove the panics and the
cpu class stuff used to implement them. The cpu class stuff used to
mainly give panics for newer CPUs that are backwards compatible but
unknown. Feature tests prevent that happening. All new x86 are 686s
with optional features. Plain 486 should work if it were classified as
a 686 without any features.

Bruce