From owner-svn-src-head@freebsd.org  Sun Mar 19 15:46:35 2017
Return-Path: <owner-svn-src-head@freebsd.org>
Delivered-To: svn-src-head@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8BB1ED13735;
 Sun, 19 Mar 2017 15:46:35 +0000 (UTC)
 (envelope-from freebsd@pdx.rh.CN85.dnsmgr.net)
Received: from pdx.rh.CN85.dnsmgr.net (br1.CN84in.dnsmgr.net [69.59.192.140])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 520261696;
 Sun, 19 Mar 2017 15:46:34 +0000 (UTC)
 (envelope-from freebsd@pdx.rh.CN85.dnsmgr.net)
Received: from pdx.rh.CN85.dnsmgr.net (localhost [127.0.0.1])
 by pdx.rh.CN85.dnsmgr.net (8.13.3/8.13.3) with ESMTP id v2JFkQ13060300;
 Sun, 19 Mar 2017 08:46:26 -0700 (PDT)
 (envelope-from freebsd@pdx.rh.CN85.dnsmgr.net)
Received: (from freebsd@localhost)
 by pdx.rh.CN85.dnsmgr.net (8.13.3/8.13.3/Submit) id v2JFkQOh060299;
 Sun, 19 Mar 2017 08:46:26 -0700 (PDT) (envelope-from freebsd)
From: "Rodney W. Grimes" <freebsd@pdx.rh.CN85.dnsmgr.net>
Message-Id: <201703191546.v2JFkQOh060299@pdx.rh.CN85.dnsmgr.net>
Subject: Re: svn commit: r315522 - in head: contrib/binutils/ld/emulparams
 sys/conf
In-Reply-To: <20170319123107.W994@besplex.bde.org>
To: Bruce Evans <brde@optusnet.com.au>
Date: Sun, 19 Mar 2017 08:46:26 -0700 (PDT)
CC: Ed Maste <emaste@freebsd.org>, src-committers@freebsd.org,
 svn-src-all@freebsd.org, svn-src-head@freebsd.org
Reply-To: rgrimes@freebsd.org
X-Mailer: ELM [version 2.4ME+ PL121h (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
 <svn-src-head.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/svn-src-head>,
 <mailto:svn-src-head-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-head/>
List-Post: <mailto:svn-src-head@freebsd.org>
List-Help: <mailto:svn-src-head-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/svn-src-head>,
 <mailto:svn-src-head-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 19 Mar 2017 15:46:35 -0000

> On Sun, 19 Mar 2017, Ed Maste wrote:
> 
> > Log:
> >  use INT3 instead of NOP for x86 binary padding
> >
> >  We should never end up executing the inter-function padding, so we
> >  are better off faulting than silently carrying on to whatever function
> >  happens to be next.
> >
> >  Note that LLD will soon do this by default (although it currently pads
> >  with zeros).
> >
> >  Reviewed by:	dim, kib
> >  MFC after:	1 month
> >  Sponsored by:	The FreeBSD Foundation
> >  Differential Revision:	https://reviews.freebsd.org/D10047
> 
> Is this a pessimization?  Instruction prefetch near the end of almost
> every function now fetches INT3 instead of NOP.  Both have to be
> decoded to decide whether to speculatively execute them.  INT3 is
> unlikely to be speculatively executed, but it takes extra work to
> decide not to do so.
> 
> Functions normally end with a RET or unconditional JMP, and then branch
> prediction usually prevents speculative execution beyond the end, so the
> pessimization must be small.
> 
> Intra-function padding that is executed now uses "fat NOP" instructions
> like null LEA's since this is faster to execute than a long string of
> NOPs.  This is less readable than NOPs or even INT3's.  Of course, INT3
> can't be used for executed padding.  I think it is also used for intra-
> function padding that is not executed.  This is just harder to read
> unless it is needed to avoid the possible pessimization in this commit.
> The intra-function code with nops might look like:
> 
>  		jmp	over
>  		nop
>  		# 7 nops altogether
>  		nop
>  	over:
> 
> or
> 
>  		jmp	over
>  		nullpad7	# single 7 byte null padding instruction
>  	over:
> 
> and it is likely to be CPU-dependent whether 7 possibly-speculatively
> executed nops take more or less resources than 1 possibly-speculatively
> executed fancy instruction.   I would expect the fancy instructions to
> take more resources each.
> 
> Fancy LEAs don't seem such a good choice for executed padding either.
> amd64 uses lots of REX prefixes instead of fancy instructions, since
> these are designed to have low overheads.  They certainly aren't
> executed separately.  On i386, the same technique with lots of older
> prefixes is not used much, probably because all prefixes have high
> overheads on old i386's.  They can be as slow as NOPs although they
> aren't executed separately.

As a middle ground, what about using N of something really easy for
the decoder/branch predictor to grovel over, then a single int3 at the
end of the block, so that if we do fall into the padding we still get
the desired fault?  That is, nop's followed by an int3.
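
A rough sketch of the fill pattern suggested above, written in Python only
to show the byte layout.  The helper name make_padding is made up for
illustration and is not anything in binutils or the tree; the opcodes
0x90 (NOP) and 0xCC (INT3) are the standard one-byte x86 encodings.

```python
NOP = 0x90   # one-byte x86 NOP
INT3 = 0xCC  # one-byte x86 breakpoint trap


def make_padding(length: int) -> bytes:
    """Return `length` bytes of padding: NOPs with one trailing INT3.

    Hypothetical helper: it only models the proposed fill pattern and
    says nothing about how a linker would actually emit it.
    """
    if length < 0:
        raise ValueError("padding length must be non-negative")
    if length == 0:
        return b""
    # Everything but the last byte is a NOP; the last byte traps.
    return bytes([NOP] * (length - 1) + [INT3])


if __name__ == "__main__":
    # A 7-byte gap between two functions.
    print(make_padding(7).hex(" "))  # prints "90 90 90 90 90 90 cc"
```

With this layout, execution that enters the padding at any offset slides
over the remaining nops and still hits the int3, so the fault is kept, but
it is reported at the end of the block rather than at the point of entry.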

> Bruce
-- 
Rod Grimes                                                 rgrimes@freebsd.org