From owner-svn-src-all@freebsd.org Wed Sep 14 20:10:01 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 072A2BDA5B2; Wed, 14 Sep 2016 20:10:01 +0000 (UTC) (envelope-from manu@bidouilliste.com) Received: from mail.blih.net (mail.blih.net [212.83.177.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.blih.net", Issuer "mail.blih.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 275EF1F89; Wed, 14 Sep 2016 20:09:59 +0000 (UTC) (envelope-from manu@bidouilliste.com) Received: from mail.blih.net (mail.blih.net [212.83.177.182]) by mail.blih.net (OpenSMTPD) with ESMTP id 81334550; Wed, 14 Sep 2016 22:09:50 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=bidouilliste.com; h=date :from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=mail; bh=p2w31a+LjzY7HQcRYyWnfAikEG0=; b=LCefPu3S4FGLHoFjit/jnfNVRo6d HAjsSz5KYxkCTDmEJg24ywGdOHbg7QpbQhcgP2CGucY3zkg/miMN/yvZsaRV49/m E1kzMESW/WK4FLzfwhKiGQRVqw7shQeb4moozGClYoPzs8h2C9yZ2hw26YE/4tiy Z7PNsp+BEX38WuA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=bidouilliste.com; h=date :from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; q=dns; s= mail; b=GqfNYLH/RZZPcAvu6gu1EdJm90oRMqNpT6cQeg/uKD0/+ByNhOSvwqZB tJAiXnLLO4r+qsYgsjN782tTrH306jl0R40Xu2zsOakb9/Gl6OkW723ypqBx6vDG noxuCneukCVNkN8rsYRyAjnstlNrAeD7X4TKWYivE001xn7UjOg= Received: from knuckles.blih.net (ip-54.net-82-216-203.roubaix.rev.numericable.fr [82.216.203.54]) by mail.blih.net (OpenSMTPD) with ESMTPSA id 107ae606 TLS version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO; Wed, 14 Sep 2016 22:09:50 +0200 (CEST) Date: Wed, 14 Sep 2016 22:09:49 +0200 From: Emmanuel Vadot To: Bruce Evans Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r305814 - head/sys/boot/common Message-Id: <20160914220949.bfbd8562ec180ab690b84b60@bidouilliste.com> In-Reply-To: <20160915042715.N2557@besplex.bde.org> References: <201609141743.u8EHhX7R038840@repo.freebsd.org> <20160915042715.N2557@besplex.bde.org> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.29; amd64-portbld-freebsd12.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Sep 2016 20:10:01 -0000 Hi Bruce, On Thu, 15 Sep 2016 05:37:19 +1000 (EST) Bruce Evans wrote: > On Wed, 14 Sep 2016, Emmanuel Vadot wrote: > > > Log: > > ufsread: Do not cast struct direct from void * > > This cause alignment problem on ARM (and possibly other archs), instead copy it. > > > > MFC after: 1 week > > > > Modified: > > head/sys/boot/common/ufsread.c > > > > Modified: head/sys/boot/common/ufsread.c > > ============================================================================== > > --- head/sys/boot/common/ufsread.c Wed Sep 14 16:47:17 2016 (r305813) > > +++ head/sys/boot/common/ufsread.c Wed Sep 14 17:43:32 2016 (r305814) > > @@ -97,21 +97,21 @@ static __inline uint8_t > > fsfind(const char *name, ufs_ino_t * ino) > > { > > static char buf[DEV_BSIZE]; > > - struct direct *d; > > + static struct direct d; > > char *s; > > ssize_t n; > > This looks like a good pessimization for space. boot2 on i386 has to > fit in 8192 bytes and has a negative number to spare (features are > already left out). Do you have any suggestion on making the code better ? This was the last patch for having EFI working on ARMv6 and this is something that I want to be enabled by default at some point. > > With auto buffers, and also builtin memcpy, the compiler can optimize > the memcpy to nothing on arches with no alignment, and should do this > with -Os. I think -ffreestanding in boot programs kills the builtin, > and this is almost intentional since compilers have poor support for > -Os so the inline memcpy is sometimes larger unless it is null. > > > > > fs_off = 0; > > while ((n = fsread(*ino, buf, DEV_BSIZE)) > 0) > > for (s = buf; s < buf + DEV_BSIZE;) { > > - d = (void *)s; > > + memcpy(&d, s, sizeof(struct direct)); > > if (ls) > > - printf("%s ", d->d_name); > > - else if (!strcmp(name, d->d_name)) { > > - *ino = d->d_ino; > > - return d->d_type; > > + printf("%s ", d.d_name); > > + else if (!strcmp(name, d.d_name)) { > > + *ino = d.d_ino; > > + return d.d_type; > > } > > - s += d->d_reclen; > > + s += d.d_reclen; > > } > > if (n != -1 && ls) > > printf("\n"); > > The static buffer in the old code also looks like a pessimization for > space. It doesn't seem to be needed to preserver or return results. > Compilers don't seem to be smart enough to see this and optimize it > to auto or vice versa. > > Testing shows that the static buffer is space optimization for gcc > and a space pessimization for clang: > > X #include > X #include > X > X int > X foo(void) > X { > X static char buf[101]; > X static int buf1[3]; > X > X read(0, buf, sizeof(buf)); > X memcpy(buf1, &buf[2], sizeof(buf1)); > X return (buf[0] + buf1[0]); > X } > > 2 static buffers give the best pessimizations for clang -Os -m32. > Even with -Os, clang prefers to use large instructions to reduce the > number of instructions. The static buffers give largest instructions, > and also prevent optimizing away the memcpy(). With auto buf1, it > takes -ffreestanding to kill the optimization of memcpy() and with > that optimization the size is reduced by almost a factor of 2. With > auto buf too, other pessimizations are applied to give a size reduction > of just 1 byte. gcc-4.2.1 never inlines the memcpy(), but it does > the other pessimizations less well to give an in-between size. > > Aproximate sizes on i386: 65 bytes for clang with statics, 49 for gcc, > 33 for clang with autos and builtins, 28 for tweaked output for clang > with static buf, auto buf1 and builtins (null memcpy). > > Probably the static buf is an intentional optimization for gcc. > Compilers do too good a job of generating large code for large stack > offsets. In my test code, the offsets are about 101 which is less than > 128 so it takes only 2 bytes. DEV_BSIZE is 512 so it needs 5-byte > offsets from the usual pessimizations for the base pointer (these > are to point %ebp at the top of the variables and put all the scalar > variables below the buffer so that the offsets are largest). But > clang also does good pessimizations for offsets from the static buffer: > > pushl $foo.buf # 5 bytes > movsbl foo.buf, %eax # 7 bytes > addl foo.buf+2, %eax # 6 bytes > > Actually optimizing for space: > > movl $foo.buf,%ebx # 5 bytes > pushl %ebx # 1 byte > movsbl (%ebx), %eax # 3 bytes > addl 2(%ebx), %eax # 3 bytes > > Bruce Cheers, -- Emmanuel Vadot