Date: Fri, 15 Feb 2019 16:04:09 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Alexey Dokuchaev <danfe@freebsd.org>, src-committers@freebsd.org,
 svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r344118 - head/sys/i386/include
Message-ID: <20190215140409.GQ24863@kib.kiev.ua>
References: <201902141353.x1EDrB0Z076223@repo.freebsd.org>
 <20190215071604.GA89653@FreeBSD.org>
 <20190215103644.GN24863@kib.kiev.ua>
 <20190215233444.F2229@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190215233444.F2229@besplex.bde.org>
User-Agent: Mutt/1.11.2 (2019-01-07)

On Sat, Feb 16, 2019 at 12:27:16AM +1100, Bruce Evans wrote:
> On Fri, 15 Feb 2019, Konstantin Belousov wrote:
> 
> > On Fri, Feb 15, 2019 at 07:16:04AM +0000, Alexey Dokuchaev wrote:
> >> On Thu, Feb 14, 2019 at 01:53:11PM +0000, Konstantin Belousov wrote:
> >>> New Revision: 344118
> >>> URL: https://svnweb.freebsd.org/changeset/base/344118
> >>>
> >>> Log:
> >>>   Provide userspace versions of do_cpuid() and cpuid_count() on i386.
> >>>
> >>>   Some older compilers, when generating PIC code, cannot handle inline
> >>>   asm that clobbers %ebx (because %ebx is used as the GOT offset
> >>>   register).  Userspace versions avoid clobbering %ebx by saving it to
> >>>   stack before executing the CPUID instruction.
> >>>
> >>> ...
> >>> +static __inline void
> >>> +do_cpuid(u_int ax, u_int *p)
> >>> +{
> >>> +	__asm __volatile(
> >>> +	    "pushl\t%%ebx\n\t"
> >>> +	    "cpuid\n\t"
> >>> +	    "movl\t%%ebx,%1\n\t"
> >>> +	    "popl\t%%ebx"
> >>
> >> Is there a reason to prefer pushl+movl+popl instead of movl+xchgl?
> >>
> >>     "movl %%ebx, %1\n\t"
> >>     "cpuid\n\t"
> >>     "xchgl %%ebx, %1"
> >
> > xchgl seems to be slower even in registers format (where no implicit
> > lock is used).  If you can demonstrate that your fragment is better in
> > some microbenchmark, I can change it.  But also note that its use is not
> > on the critical path.
> 
> They should have the same speed on modern x86.  xchgl %reg1,%reg2 is
> not slow, but it changes 2 visible registers and needs somewhere to
> hold one of the registers while changing it, so on 14 year old AthlonXP
> where I know the times in cycles better, register xchgl was twice as slow
> as register move (2 cycles latency instead of 1, and throughput ==
> latency (?)).  On 2015 Haswell, register movl in a loop is in parallel
> with the loop overhead (1 cycle), while xchgl and pushl/popl take 0.5
> cycles longer on average.  Latency might be a problem for pushl/popl
> in critical paths.  There aren't many of those.

I think on modern Intels xchgl is implemented by renaming.  Still, it is
slower than push/pop, which are typically highly optimized.

That said, what is your preference?  My version or xchgl?
My own preference is to leave it as is, since xchgl is, if anything,
slightly slower, and I do not want to spend several more hours re-testing
the libc changes.
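
For reference, here is a minimal side-by-side sketch of the two variants
under discussion.  It is an illustration only, not the committed code: the
names cpuid_xchg(), cpuid_pushpop(), cpuid_vendor() and
cpuid_variants_agree() are hypothetical, the operand constraints are my own
choice, and it assumes a GCC-compatible compiler and a CPU that supports
CPUID.  The push/pop variant is built only for i386 here, because on amd64
a push inside inline asm can overwrite the red zone; on other architectures
the helpers just report that CPUID is unavailable.

```c
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#define	HAVE_CPUID_SKETCH 1
/* xchg variant, 64-bit: park %rbx in a scratch register around CPUID. */
static inline void
cpuid_xchg(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
	uint64_t bx;

	__asm__ __volatile__(
	    "movq\t%%rbx,%1\n\t"	/* scratch = caller's %rbx */
	    "cpuid\n\t"
	    "xchgq\t%%rbx,%1"		/* scratch = CPUID %ebx, %rbx restored */
	    : "=a" (regs[0]), "=&r" (bx), "=c" (regs[2]), "=d" (regs[3])
	    : "0" (leaf), "2" (subleaf));
	regs[1] = (uint32_t)bx;
}
#elif defined(__i386__)
#define	HAVE_CPUID_SKETCH 1
/* xchg variant, 32-bit: same idea with %ebx. */
static inline void
cpuid_xchg(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
	__asm__ __volatile__(
	    "movl\t%%ebx,%1\n\t"
	    "cpuid\n\t"
	    "xchgl\t%%ebx,%1"
	    : "=a" (regs[0]), "=&r" (regs[1]), "=c" (regs[2]), "=d" (regs[3])
	    : "0" (leaf), "2" (subleaf));
}

/*
 * push/pop variant: save %ebx on the stack.  The %ebx result is copied
 * to %esi ("=S"), so the compiler can never pick %ebx for operand 1.
 */
static inline void
cpuid_pushpop(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
	__asm__ __volatile__(
	    "pushl\t%%ebx\n\t"
	    "cpuid\n\t"
	    "movl\t%%ebx,%1\n\t"
	    "popl\t%%ebx"
	    : "=a" (regs[0]), "=S" (regs[1]), "=c" (regs[2]), "=d" (regs[3])
	    : "0" (leaf), "2" (subleaf));
}
#endif

/* Fill vendor[] from leaf 0; return 0 on x86, -1 on other architectures. */
static inline int
cpuid_vendor(char vendor[13])
{
#ifdef HAVE_CPUID_SKETCH
	uint32_t r[4];

	cpuid_xchg(0, 0, r);
	memcpy(vendor + 0, &r[1], 4);	/* %ebx */
	memcpy(vendor + 4, &r[3], 4);	/* %edx */
	memcpy(vendor + 8, &r[2], 4);	/* %ecx */
	vendor[12] = '\0';
	return (0);
#else
	vendor[0] = '\0';
	return (-1);
#endif
}

/* Return 1 if both variants agree on leaf 0 (trivially 1 if only one built). */
static inline int
cpuid_variants_agree(void)
{
#if defined(__i386__)
	uint32_t a[4], b[4];

	cpuid_xchg(0, 0, a);
	cpuid_pushpop(0, 0, b);
	return (memcmp(a, b, sizeof(a)) == 0);
#else
	return (1);
#endif
}
```

Calling cpuid_vendor() with a 13-byte buffer gives the vendor string on x86.
Either variant leaves %ebx intact for PIC code, so the choice between them
is only about the cost discussed above.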