Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 15 May 2011 12:50:04 +0200
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r221913 - head/sys/dev/mii
Message-ID:  <20110515105004.GG92688@alchemy.franken.de>
In-Reply-To: <20110515135737.B825@besplex.bde.org>
References:  <201105142031.p4EKV5el014531@svn.freebsd.org> <20110515135737.B825@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, May 15, 2011 at 02:35:37PM +1000, Bruce Evans wrote:
> On Sat, 14 May 2011, Marius Strobl wrote:
> 
> >Log:
> > - There's no need for nibbletab to be static, it's const however.
> > - Fix whitespace.
> 
> There is every reason for it to be static.
> 
> >Modified: head/sys/dev/mii/mii.c
> >==============================================================================
> >--- head/sys/dev/mii/mii.c	Sat May 14 19:36:12 2011	(r221912)
> >+++ head/sys/dev/mii/mii.c	Sat May 14 20:31:04 2011	(r221913)
> >@@ -552,7 +552,7 @@ mii_down(struct mii_data *mii)
> >static unsigned char
> >mii_bitreverse(unsigned char x)
> >{
> >-	static unsigned char nibbletab[16] = {
> >+	unsigned const char const nibbletab[16] = {
> >		0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15
> >	};
> 
> Making it non-static asks the compiler to pessimize it by initializing
> it on every entry to the function.  Some versions of gcc do a very good

This function is mostly used during device probing (and the result of
the caller cached afterwards) and not in the hot path, thus there's no
need to tell the compiler that it _must_ keep a static copy of the array.

> job with this pessimization.  The normal pessimzation is to keep a
> static copy of the array anyway, and copy this to the stack on every
> entry to the function.  However, the pessimization seems to be broken
> by gcc-4.3.1 in -current.  gcc-3.3.3 on i386 gives:
> 
> %%%
> $ cat z.c
> int
> foo(void)
> {
> 	const unsigned char foo[16] = { 1, 2, 3, 4, };
> 
> 	test(foo[0], foo[1]);
> }
> $ gcc-3.3.3 -O2 -S z.c
> $ cat z.S
> 	.file	"z.c"
> 	.text
> 	.p2align 2,,3
> .globl foo
> 	.type	foo, @function
> foo:
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	pushl	%edi
> 	subl	$28, %esp
> 	leal	-24(%ebp), %edi
> 	cld
> 	xorl	%eax, %eax
> 	movl	$4, %ecx
> 	rep
> 	stosl
> 	pushl	$2
> 	pushl	$1
> 	movb	$1, -24(%ebp)
> 	movb	$2, -23(%ebp)
> 	movb	$3, -22(%ebp)
> 	movb	$4, -21(%ebp)
> 	call	test
> 	movl	-4(%ebp), %edi
> 	leave
> 	ret
> 	.size	foo, .-foo
> 	.ident	"GCC: (GNU) 3.3.3 [FreeBSD] 20031106"
> %%%
> 
> For this small array, gcc does an even better pessimization for space
> than copying from a static copy.  It constructs the array entirely on
> the stack.  First it bzeros 16 bytes on the stack using an inefficient
> string instruction (so that the unused top 12 bytes of the array are
> initialized).  Then uses individual stores to initialize the low 4
> bytes of the array.  Now it knows that individual stores are faster
> than string instructions for small data.  The low 4 bytes are unused
> too.  The top 2 of these bytes are entirely unused, and the low 2 bytes
> are optimized to constants at compile time.
> 
> Declaring the array as static breaks the pessimization for gcc-3.3.3
> fairly well.  The code is then "pushl $2; pushl $1; call test".  The
> breakage is not quite complete -- data for the unused static array is
> still generated.
> 
> Using gcc-4.2.1 breaks the pessimization completely.  The code is then
> "pushl $2; pushl $1; call test", and data for the unused array is not
> generated, irrespective of whether the array is declared static or const.
> 
> Using -O instead of -O2 makes little difference.  Using -O0 fixes the
> pessimization for gcc-4.2.1 and gives the minor additional pessimization
> of loading the function parameters from the array.
> 
> Changing the array size from 16 to 1024 gives the normal pessimization
> for gcc-3.3.3 of memcpy()'ing the array from a static copy.  The code is
> then much more readable than the above, since it is not fully pessimized
> for space.
> 
> This pessimization is not common in the kernel.  K&R compilers have
> the feature that initialization of auto arrays is a syntax error.  Thus
> old code doesn't even have nighmares about this pessimization.
> 
> I noticed this pessimization mainly in the old ata driver.  That has
> lots of smallish arrays of data, and sqillions of function calls hidden
> in access macros.  Stepping through the generated code using ddb was
> very painful since what should be single memory accesses or constant
> loads was a function call or three for the memcpy()'s and access
> functions.
> 

I agree that such code shoudn't be used in the hot path.

Marius




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110515105004.GG92688>