Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 8 Aug 2016 12:21:19 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Bruce Simpson <bms@fastmail.net>
Cc:        Peter Jeremy <peter@rulingia.com>, Hans Petter Selasky <hps@selasky.org>,  Adrian Chadd <adrian@freebsd.org>, src-committers@freebsd.org,  svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r303811 - in head/sys: net net80211
Message-ID:  <20160808114333.Y957@besplex.bde.org>
In-Reply-To: <32531e89-85fa-6012-6a3e-1e6a42843dc0@fastmail.net>
References:  <201608070348.u773mXXt030939@repo.freebsd.org> <46183559-343f-401b-6471-3822e3383a50@selasky.org> <20160807192333.GA79784@server.rulingia.com> <32531e89-85fa-6012-6a3e-1e6a42843dc0@fastmail.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 8 Aug 2016, Bruce Simpson wrote:

> On 07/08/16 20:23, Peter Jeremy wrote:
>>> On 08/07/16 05:48, Adrian Chadd wrote:
>>>> +#define	ETHER_IS_BROADCAST(addr) \
> ...
>> IMHO, Adrian's code is clearer and micro-optimisations like this belong
>> in the complier, not the code.

Both are unclear micro-optimizations.

>> If you want to make a few more assumptions (64-bit unaligned little-endian
>> architecture with at least 2 accessible bytes beyond the address), the
>> following has the added advantage of only referencing (addr) once.  (I'm
>> not suggesting it as a replacement though).
>> 
>> #define	ETHER_IS_BROADCAST(addr) \
>> 	((*(const long *)(addr) & 0x00ffffffl) =3D=3D 0x00ffffffl)

That is only 24 bits :-).

Machines with uint48_t and data with good alignment allow the nicest
wat to express this and probably to execute it:

#define	ETHER_IS_BROADCAST(addr) \
 	(*(uint48_t *)(addr) == 0xFFFFFFFFFFFF)

Otherwise, we have to use uintmax_t and worry about padding as well
as alignment, and pessimizations from it being very large.

> *cough* *cough* 2007 wants its patch back.
>
> https://people.freebsd.org/~bms/dump/old/latest-8021p.diff
>
> The whole point of using bcmp() was to allow it to be inlined by the compiler 
> for the target arch, rather than forcing it to play guessing games with 
> (lowest-common-denominator machine-word-size) integer accesses.

The compiler can't possibly inline bcmp() since it is nonstandard and
we don't implement it as a special nonstandard inline.

The compiler can't possibly inline memcmp() since it is nonstandard in
in -ffreestanding mode and we don't implement it as a special nonstandard
inline.  __builtin_memcmp() might work.  The data would need to be in
a header so that it can be replaced by the immediate constant 0xffff*
if that is better.

But FreeBSD doesn't bother with optimizations like this.  It uses
memcpy() instead of bcopy() in some places to get the builtin memcpy(),
but this was broken and turned into just a style bug long ago by
-freestanding.  -ffreestanding turns off almost all builtins since
most builtins are for standard functions.  FreeBSD doesn't bother to
replace even memcpy() by __builtin_memcpy() so that it works as intended
again.  I do this in some versions, but it makes little difference.
Perhaps doing this for hundreds of functions would make a difference
of more like 1% than 0.01% (on average, or maybe 10% for a micro-benchmark).

Someone pessimized struct ethernet by declaring it as __packed (without
declaring it as __aligned).  It must be packed, and probably ends up
misaligned in some cases.  Thus declaration ensures that the compiler
treats it as misaligned in all cases unless it can prove otherwise.
On x86, the compiler can just generate possibly-misaligned word-sized
accesses.  On other arches, it might have to generate 6 1-byte loads
and combine the results, as expressed in the unobfuscated comparison.

> ETHER_IS_MULTICAST(), by contrast, only needs to check a single bit in the 
> 48-bit MAC.

Also blame the ethernet designers for bad data formats :-).

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160808114333.Y957>