Date:      Thu, 12 Feb 2009 14:05:37 -0800 (PST)
From:      Nate Eldredge <neldredge@math.ucsd.edu>
To:        Marcel Moolenaar <xcllnt@mac.com>
Cc:        freebsd-hackers@freebsd.org, Julian Stacey <jhs@berklix.org>, Jille Timmermans <jille@quis.cx>, Andrew Brampton <brampton+freebsd-hackers@gmail.com>
Subject:   Re: pahole - Finding holes in kernel structs
Message-ID:  <Pine.GSO.4.64.0902121356310.975@zeno.ucsd.edu>
In-Reply-To: <B9DCCF52-36A3-4331-B439-6CBF88158C44@mac.com>
References:  <200902121549.n1CFnLdt002361@fire.js.berklix.net> <49944F8F.5080104@quis.cx> <B9DCCF52-36A3-4331-B439-6CBF88158C44@mac.com>

On Thu, 12 Feb 2009, Marcel Moolenaar wrote:

>
> On Feb 12, 2009, at 8:34 AM, Jille Timmermans wrote:
>
>> Julian Stacey wrote:
>>>> 1) Is it worth my time trying to rearrange structs?
>>> I wondered whether as a sensitivity test, some version of gcc (or
>>> its competitor ?) might have capability to automatically re-order
>>> variables ?  but found nothing in man gcc "Optimization Options".
>> There is a __packed attribute, I don't know what it exactly does and 
>> whether it is an improvement.
>> 
>
> __packed is always a gross pessimization. The side-effect of
> packing a structure is that the alignment of the structure
> drops to 1. That means that any field will be read 1 byte at
> a time and reconstructed by logical operations.

The other alternative is to read or write that member with unaligned
operations, on platforms that support them.  This also typically comes
with a performance penalty, of course.  Usually it means the hardware
reads the two words that the member straddles and pieces the value back
together.  But on such a platform the software does not need to handle
it specially; it executes the same instruction, which just takes more
time.
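
To make the two approaches concrete, here is a rough sketch in C.  The
function names are made up for illustration and nothing here is taken
from the kernel.  The first function rebuilds a 32-bit value one byte at
a time with shifts and ORs (assuming little-endian order in memory); the
second uses memcpy(), which a compiler is free to turn into a single
unaligned load on hardware that allows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Byte-at-a-time load: read four bytes and reassemble them with
 * logical operations.  Assumes the value is stored little-endian.
 */
uint32_t
load_u32_bytewise_le(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] |
        ((uint32_t)p[1] << 8) |
        ((uint32_t)p[2] << 16) |
        ((uint32_t)p[3] << 24));
}

/*
 * memcpy load: copies four bytes in native byte order.  The pointer
 * may have any alignment; the compiler picks the instructions.
 */
uint32_t
load_u32_memcpy(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return (v);
}
```

Either function can be handed a pointer at an odd offset into a buffer.
Which one is faster depends on the platform, as described above.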

The only reasons to use this would be (1) if you needed to have your
structure occupy as little memory as possible; for instance, if your
structure had two elements, one 'int' and one 'char', and you had 1
billion of them, using __packed__ would save you 3 gigabytes (assuming a
4-byte int, so 5 bytes per structure instead of 8).  Or (2) if you need
to conform to an externally defined data structure that is already
packed.  In most places in the kernel, I don't think either of these
would be true.
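
The int-plus-char case can be checked with a small sketch like the one
below.  The struct and function names are hypothetical, and the
attribute is written in the GCC/Clang spelling.  The 3-byte saving
assumes a 4-byte, 4-byte-aligned int; other ABIs may give different
numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Natural layout: the compiler normally pads after 'c' so that 'i'
 * stays aligned in every element of an array of these.
 */
struct rec_padded {
    int  i;
    char c;
};

/* Same members, packed: no padding, and alignment drops to 1. */
struct rec_packed {
    int  i;
    char c;
} __attribute__((__packed__));

/* Bytes saved per element by packing. */
size_t
bytes_saved_per_element(void)
{
    /* Guard against unsigned wrap-around in the subtraction. */
    assert(sizeof(struct rec_padded) >= sizeof(struct rec_packed));
    return (sizeof(struct rec_padded) - sizeof(struct rec_packed));
}

/* Print the size and alignment of both layouts. */
void
print_layouts(void)
{
    printf("padded: size %zu, align %zu\n",
        sizeof(struct rec_padded), _Alignof(struct rec_padded));
    printf("packed: size %zu, align %zu\n",
        sizeof(struct rec_packed), _Alignof(struct rec_packed));
    printf("saved per element: %zu bytes\n",
        bytes_saved_per_element());
}
```

Calling print_layouts() from main() shows the per-element difference;
multiply it by the number of elements to get the total saving.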

-- 

Nate Eldredge
neldredge@math.ucsd.edu


