From: Bruce Evans
Date: Sat, 16 Feb 2019 00:27:16 +1100 (EST)
To: Konstantin Belousov
cc: Alexey Dokuchaev, src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-head@freebsd.org
Subject: Re: svn commit: r344118 - head/sys/i386/include
In-Reply-To: <20190215103644.GN24863@kib.kiev.ua>
Message-ID: <20190215233444.F2229@besplex.bde.org>
References: <201902141353.x1EDrB0Z076223@repo.freebsd.org>
    <20190215071604.GA89653@FreeBSD.org>
    <20190215103644.GN24863@kib.kiev.ua>

On Fri, 15 Feb 2019, Konstantin Belousov wrote:

> On Fri, Feb 15, 2019 at 07:16:04AM +0000, Alexey Dokuchaev wrote:
>> On Thu, Feb 14, 2019 at 01:53:11PM +0000, Konstantin Belousov wrote:
>>> New Revision: 344118
>>> URL: https://svnweb.freebsd.org/changeset/base/344118
>>>
>>> Log:
>>>   Provide userspace versions of do_cpuid() and cpuid_count() on i386.
>>>
>>>   Some older compilers, when generating PIC code, cannot handle inline
>>>   asm that clobbers %ebx (because %ebx is used as the GOT offset
>>>   register).  Userspace versions avoid clobbering %ebx by saving it to
>>>   stack before executing the CPUID instruction.
>>>
>>> ...
>>> +static __inline void
>>> +do_cpuid(u_int ax, u_int *p)
>>> +{
>>> +	__asm __volatile(
>>> +	    "pushl\t%%ebx\n\t"
>>> +	    "cpuid\n\t"
>>> +	    "movl\t%%ebx,%1\n\t"
>>> +	    "popl\t%%ebx"
>>
>> Is there a reason to prefer pushl+movl+popl instead of movl+xchgl?
>>
>> 	"movl %%ebx, %1\n\t"
>> 	"cpuid\n\t"
>> 	"xchgl %%ebx, %1"
>
> xchgl seems to be slower even in registers format (where no implicit
> lock is used).  If you can demonstrate that your fragment is better in
> some microbenchmark, I can change it.  But also note that its use is not
> on the critical path.

They should have the same speed on modern x86.  xchgl %reg1,%reg2 is not
slow, but it changes 2 visible registers and needs somewhere to hold one
of the registers while changing it, so on 14 year old AthlonXP where I
know the times in cycles better, register xchgl was twice as slow as
register move (2 cycles latency instead of 1, and throughput == latency
(?)).  On 2015 Haswell, register movl in a loop is in parallel with the
loop overhead (1 cycle), while xchgl and pushl/popl take 0.5 cycles longer
on average.  Latency might be a problem for pushl/popl in critical paths.
There aren't many of those.

There is no reason to use the style with strings made unreadable using
soft tabs and newlines.  gcc supported hard newlines 20-30 years ago, but
broke this because C90 or C99 made hard newlines in strings invalid.  This
broke lots of my asms.  I now use hard tabs and backslash-hard_newlines
after soft newlines:

 	__asm __volatile("				\n\
 		pushl	%%ebx			\n\
 		cpuid				\n\
 		movl	%%ebx,%1		\n\
 		popl	%%ebx			\n\
 	");

The Standard C lossage forces use of \n\ before each hard newline, and
readability forces a hard-to-edit variable number of hard tabs before \n\,
but otherwise the code looks the same as before (opcodes are outdented to
column 8 in large asms, and labels are outdented to column 0, so that the
code looks the same as non-inline asm too).

Bruce
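
The two string styles discussed above can be compared without assembling
anything.  The following is only a sketch and is not from the original
mail: it stores the same four instructions once as adjacent literals with
escaped tabs and newlines (the committed style) and once as a single
literal continued with \n\ (the style described in the mail), then counts
the non-blank lines in each.  The names soft_style, hard_style,
count_insns() and check_styles() are hypothetical and exist only for this
illustration.

```c
#include <assert.h>

/* Committed style: adjacent literals, escaped ("soft") tabs and newlines. */
const char soft_style[] =
    "pushl\t%%ebx\n\t"
    "cpuid\n\t"
    "movl\t%%ebx,%1\n\t"
    "popl\t%%ebx";

/*
 * Style from the mail: one literal.  Each source line ends with a soft
 * newline (\n) followed by a backslash that splices the hard newline away.
 */
const char hard_style[] = "\n\
	pushl	%%ebx		\n\
	cpuid			\n\
	movl	%%ebx,%1	\n\
	popl	%%ebx		\n\
";

/* Hypothetical helper: count lines holding anything other than blanks. */
int
count_insns(const char *s)
{
	int n = 0, seen = 0;

	for (; *s != '\0'; s++) {
		if (*s == '\n') {
			n += seen;
			seen = 0;
		} else if (*s != ' ' && *s != '\t')
			seen = 1;
	}
	return (n + seen);
}

/* Both spellings should carry the same four instructions. */
void
check_styles(void)
{
	assert(count_insns(soft_style) == 4);
	assert(count_insns(hard_style) == 4);
}
```

Calling check_styles() from a test program is enough to exercise it.  The
strings differ only in whitespace, which the assembler ignores, so the
choice between the two styles is about how the source reads and not about
the generated code.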