Date: Mon, 8 Aug 2016 12:21:19 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Bruce Simpson <bms@fastmail.net> Cc: Peter Jeremy <peter@rulingia.com>, Hans Petter Selasky <hps@selasky.org>, Adrian Chadd <adrian@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r303811 - in head/sys: net net80211 Message-ID: <20160808114333.Y957@besplex.bde.org> In-Reply-To: <32531e89-85fa-6012-6a3e-1e6a42843dc0@fastmail.net> References: <201608070348.u773mXXt030939@repo.freebsd.org> <46183559-343f-401b-6471-3822e3383a50@selasky.org> <20160807192333.GA79784@server.rulingia.com> <32531e89-85fa-6012-6a3e-1e6a42843dc0@fastmail.net>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 8 Aug 2016, Bruce Simpson wrote: > On 07/08/16 20:23, Peter Jeremy wrote: >>> On 08/07/16 05:48, Adrian Chadd wrote: >>>> +#define ETHER_IS_BROADCAST(addr) \ > ... >> IMHO, Adrian's code is clearer and micro-optimisations like this belong >> in the complier, not the code. Both are unclear micro-optimizations. >> If you want to make a few more assumptions (64-bit unaligned little-endian >> architecture with at least 2 accessible bytes beyond the address), the >> following has the added advantage of only referencing (addr) once. (I'm >> not suggesting it as a replacement though). >> >> #define ETHER_IS_BROADCAST(addr) \ >> ((*(const long *)(addr) & 0x00ffffffl) =3D=3D 0x00ffffffl) That is only 24 bits :-). Machines with uint48_t and data with good alignment allow the nicest wat to express this and probably to execute it: #define ETHER_IS_BROADCAST(addr) \ (*(uint48_t *)(addr) == 0xFFFFFFFFFFFF) Otherwise, we have to use uintmax_t and worry about padding as well as alignment, and pessimizations from it being very large. > *cough* *cough* 2007 wants its patch back. > > https://people.freebsd.org/~bms/dump/old/latest-8021p.diff > > The whole point of using bcmp() was to allow it to be inlined by the compiler > for the target arch, rather than forcing it to play guessing games with > (lowest-common-denominator machine-word-size) integer accesses. The compiler can't possibly inline bcmp() since it is nonstandard and we don't implement it as a special nonstandard inline. The compiler can't possibly inline memcmp() since it is nonstandard in in -ffreestanding mode and we don't implement it as a special nonstandard inline. __builtin_memcmp() might work. The data would need to be in a header so that it can be replaced by the immediate constant 0xffff* if that is better. But FreeBSD doesn't bother with optimizations like this. It uses memcpy() instead of bcopy() in some places to get the builtin memcpy(), but this was broken and turned into just a style bug long ago by -freestanding. -ffreestanding turns off almost all builtins since most builtins are for standard functions. FreeBSD doesn't bother to replace even memcpy() by __builtin_memcpy() so that it works as intended again. I do this in some versions, but it makes little difference. Perhaps doing this for hundreds of functions would make a difference of more like 1% than 0.01% (on average, or maybe 10% for a micro-benchmark). Someone pessimized struct ethernet by declaring it as __packed (without declaring it as __aligned). It must be packed, and probably ends up misaligned in some cases. Thus declaration ensures that the compiler treats it as misaligned in all cases unless it can prove otherwise. On x86, the compiler can just generate possibly-misaligned word-sized accesses. On other arches, it might have to generate 6 1-byte loads and combine the results, as expressed in the unobfuscated comparison. > ETHER_IS_MULTICAST(), by contrast, only needs to check a single bit in the > 48-bit MAC. Also blame the ethernet designers for bad data formats :-). Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160808114333.Y957>