From owner-svn-src-head@FreeBSD.ORG Wed Mar 25 20:15:30 2015 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1282C83E; Wed, 25 Mar 2015 20:15:30 +0000 (UTC) Received: from mail-wg0-x230.google.com (mail-wg0-x230.google.com [IPv6:2a00:1450:400c:c00::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9322DC83; Wed, 25 Mar 2015 20:15:29 +0000 (UTC) Received: by wgdm6 with SMTP id m6so40900024wgd.2; Wed, 25 Mar 2015 13:15:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=BgOwk2XQY3iZ6GEFotLMdYbmEgWMsELIC9uBuXvAIkg=; b=0Kb3ehhreeR+90EhBia1weGXi4BS+9iOwku8wW41J+Q90cMVcUOROsDngJrjFxMdmq eLvEXON0X8tOQ2t6kASbdPSVmhUoeM29D2wsHJfax71IpHR45lY/OIR/4Q77Y7cSsmC5 pk+wWEleedRQRoFm+0q2nO0W1u5StkydBg7IFIWGly9T6glv2s/AD9lGGLP11kluSVl/ C2i4CzXevVpg8TGWij/8Y0OugdvhRWgAIIP6m2xcBu3R9VCea3zRveBFzLeJuDsKHeIU iGZdhNYbNIq7iQwY1hA+/jCAZ7u6HMcN76UAXF3GInSYEj8mSp+b/8G5yV92Q3SYt8HC 9Qgw== X-Received: by 10.180.221.232 with SMTP id qh8mr40505608wic.19.1427314527993; Wed, 25 Mar 2015 13:15:27 -0700 (PDT) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by mx.google.com with ESMTPSA id j9sm5212461wjy.18.2015.03.25.13.15.26 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 25 Mar 2015 13:15:27 -0700 (PDT) Date: Wed, 25 Mar 2015 21:15:24 +0100 From: Mateusz Guzik To: Bruce Evans Subject: Re: svn commit: r280407 - head/sys/kern Message-ID: <20150325201524.GB14280@dft-labs.eu> References: <201503240010.t2O0ACZb089906@svn.freebsd.org> <20150324135350.N1665@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20150324135350.N1665@besplex.bde.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Mar 2015 20:15:30 -0000 On Tue, Mar 24, 2015 at 03:58:14PM +1100, Bruce Evans wrote: > On Tue, 24 Mar 2015, Mateusz Guzik wrote: > > >Log: > > filedesc: microoptimize fget_unlocked by getting rid of fd < 0 branch > > This has no effect. Compilers optimize to the equivalent of the the > unsigned cast hack if this is good. On x86, it is good since no > instructions are needed for the conversion, and the only difference > between the code generated by > > if (fd >= fdt->fdt_nfiles) > > and > > if (fd < 0 || fd >= fdt->fdt_nfiles) > > is to change from jl (jump if less than) to jb (jump if below) (both > jumps over the if clause). Negative values satisfy jl but not jb. > I would not commit the change if it did not affect generated assembly at least on amd64. if (fd < 0 || fd >= fdt->fdt_nfiles): 0xffffffff807d147d : sub $0x38,%rsp 0xffffffff807d1481 : test %esi,%esi 0xffffffff807d1483 : js 0xffffffff807d15b8 0xffffffff807d1489 : mov (%rdi),%rbx 0xffffffff807d148c : cmp %esi,(%rbx) 0xffffffff807d148e : jle 0xffffffff807d15bf if ((u_int)fd >= fdt->fdt_nfiles): 0xffffffff807d147d : sub $0x38,%rsp 0xffffffff807d1481 : mov (%rdi),%rbx 0xffffffff807d1484 : cmp %esi,(%rbx) 0xffffffff807d1486 : jbe 0xffffffff807d15a8 I did not check other archs prior to the commit. This is clang 3.6 as present in head. Sources compiled with -O2. Also see below for other compiler test. > > Casting fd to an unsigned type simplifies fd range coparison to mere checking > > if the result is bigger than the table. > > No, it obfuscates the range comparison. > It is a standard hack which is hard to misread and which seems to add a slight benefit (see below). > On some arches, conversion to unsigned is slow. Then compilers should > optimize in the opposite direction by first undoing the bogus cast to > get back to the range check and then optimizing the range check using > the best strategy. Compilers should probably first undo the bogus > cast even on x86, so as to reduce to the range check case. Range checks > are more important and more uniform than bogus casts, so they are more > likely to be optimized. Similarly if fd has type u_int to begin with. It affects assembly on all arm, powerpc64 and mips64 as well. Both arm and powerpc just get rid of zero-test and use the same instructions to perform the other comparison. I only found a difference on mips64 which used sltu instead of slt (but still got rid of the zero-check). Granted I don't know mips instruction costs and I don't have the real hadrware to benchark on. Seems like a win for most architectures anyway. > > >Modified: head/sys/kern/kern_descrip.c > >============================================================================== > >--- head/sys/kern/kern_descrip.c Tue Mar 24 00:01:30 2015 (r280406) > >+++ head/sys/kern/kern_descrip.c Tue Mar 24 00:10:11 2015 (r280407) > >@@ -2342,7 +2342,7 @@ fget_unlocked(struct filedesc *fdp, int > >#endif > > > > fdt = fdp->fd_files; > >- if (fd < 0 || fd >= fdt->fdt_nfiles) > >+ if ((u_int)fd >= fdt->fdt_nfiles) > > return (EBADF); > > /* > > * Fetch the descriptor locklessly. We avoid fdrop() races by > [..] > - fget_locked() seems to be unchanged recently, so it doesn't have the > obfuscation. fget_unlocked() is newer than the unobfuscated version > of fget_locked(). It is in a different file, but may have copied the > unobfuscated range check from fget_locked(). This commit restores the > obfuscation to it alone. > This definitely should be synced one way or the other, thanks for pointing this out. > There are now some other range checks in kern_desc.c that are missing > the obfuscation: grepping for "fd <" gives: > - a magic treatment for negative fd's in closefrom() Well I'll start with a short note that I don't know what's up with uap->lowfd = 0; instead of retuning with EINVAL. Anyay, the cast there would not have any use. > - the range check in an unobfuscated by confusing form in fdisused(). > The check has to be inverted since it is in a KASSERT(), and the > inversion is done by reversing the inequalities. Correct, this likely should also be synced (one way or the other). > - similarly in fdalloc(). > Same. > I used to use this hack a lot 30 years ago, but stopped when compilers > got better 20-25 years ago. > I wrote a toy program and checked e.g. gcc5 and it still did not optimise zero-test away. In fact I would argue the optimisation in question is impossible unless upper limit check is against a constant in (0, INT_MAX) range or against a var whose range is known at compile time. In particular this is problematic for negative values. Consider: int fd, limit; ............ if (fd < 0 || fd >= limit) Let's have fd = -5 and limit = -4. Should the fd < 0 check be dropped by the compiler and the expression turned into (u_int)fd >= limit, the coparison would be false thus changing the outcome. As such, I prefer to keep the cast. -- Mateusz Guzik