Date: Tue, 24 Feb 2015 11:59:02 -0700
From: Ian Lepore <ian@freebsd.org>
To: John-Mark Gurney <jmg@funkthat.com>
Cc: Zbigniew Bodek <zbb@freebsd.org>, svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r279236 - head/sys/netinet
Message-ID: <1424804342.3293.9.camel@freebsd.org>
In-Reply-To: <20150224173413.GF46794@funkthat.com>
References: <201502241257.t1OCv40V097418@svn.freebsd.org> <20150224173413.GF46794@funkthat.com>
On Tue, 2015-02-24 at 09:34 -0800, John-Mark Gurney wrote:
> Zbigniew Bodek wrote this message on Tue, Feb 24, 2015 at 12:57 +0000:
> > Author: zbb
> > Date: Tue Feb 24 12:57:03 2015
> > New Revision: 279236
> > URL: https://svnweb.freebsd.org/changeset/base/279236
> >
> > Log:
> > Change struct attribute to avoid aligned operations mismatch
> >
> > Previous __alignment(4) allowed compiler to assume that operations are
> > performed on aligned region. On ARM processor, this led to alignment fault
> > as shown below:
> > trapframe: 0xda9e5b10
> > FSR=00000001, FAR=a67b680e, spsr=60000113
> > r0 =00000000, r1 =00000068, r2 =0000007c, r3 =00000000
> > r4 =a67b6826, r5 =a67b680e, r6 =00000014, r7 =00000068
> > r8 =00000068, r9 =da9e5bd0, r10=00000011, r11=da9e5c10
> > r12=da9e5be0, ssp=da9e5b60, slr=a054f164, pc =a054f2cc
> >
> > <...>
> >
> > udp_input+0x264: ldmia r5, {r0-r3, r6}
> > udp_input+0x268: stmia r12, {r0-r3, r6}
> >
> > This was due to instructions which do not support unaligned access,
> > whereas for __alignment(2) compiler replaced ldmia/stmia with some
> > logically equivalent memcpy operations.
> > In fact, the assumption that 'struct ip' is always 4-byte aligned
> > is definitely false, as we have no impact on data alignment of packet
> > stream received.
>
> So, the whole point of ETHER_ALIGN is to make struct ip aligned on
> 4 byte offsets... This will probably impact performance on arm for
> properly aligned struct ip...
>

ETHER_ALIGN is wonderful... if you're on a platform that can DMA to an
arbitrary boundary. Of course, if you're on such a platform it can
probably just access the word-sized values on halfword boundaries
anyway.

For arm, the only solution at the driver level is to memcpy() every
incoming packet to another buffer to realign it. If you think that
makes receive performance really bad, you'd be right.

Many arm platforms can only DMA on a cacheline boundary. The size of an
mbuf header is like 24 or 28 or something, definitely not cache aligned.
So in addition to the extra copying the drivers do for ETHER_ALIGN,
there could also be bounce-buffer copying involved due to the alignment
in the busdma tag.

The latter issue could be fixed with an MD padding field at the end of
the mbuf header to make the data portion start on a cache line boundary.
When I experimented with that concept on imx6 I gained 10 MB/sec
performance from the reduced copying.

-- Ian
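
[Editorial sketch, not code from the FreeBSD tree.] The arithmetic behind
the points above can be written out in a few lines of C. A 14-byte
Ethernet header leaves the IP header 2 bytes off a 4-byte boundary unless
the frame itself starts at a 2-byte offset, which is what ETHER_ALIGN
provides. The helper names ip_hdr_misalign, load32_unaligned and
pad_to_cacheline are hypothetical, and any concrete header or cache line
size passed to them is an assumption rather than a figure taken from the
kernel sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_HDR_LEN 14 /* destination MAC + source MAC + EtherType */
#define ETHER_ALIGN    2 /* offset that puts the IP header on a 4-byte boundary */

/*
 * Misalignment (0..3) of the IP header when the Ethernet frame starts
 * frame_off bytes into a buffer whose base is 4-byte aligned.
 */
static unsigned
ip_hdr_misalign(size_t frame_off)
{
	return ((unsigned)((frame_off + ETHER_HDR_LEN) % 4));
}

/*
 * Read a 32-bit value from an address of unknown alignment. Going
 * through memcpy() lets the compiler pick an access sequence that is
 * valid for any alignment instead of assuming a word-aligned load.
 */
static uint32_t
load32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return (v);
}

/*
 * Hypothetical helper: padding needed after a header of hdr_len bytes
 * so that the data following it starts on a cache line boundary.
 */
static size_t
pad_to_cacheline(size_t hdr_len, size_t line)
{
	return ((line - hdr_len % line) % line);
}
```

With these, ip_hdr_misalign(0) is 2 and ip_hdr_misalign(ETHER_ALIGN) is 0.
For a hypothetical 24-byte header and an assumed 32-byte cache line,
pad_to_cacheline(24, 32) is 8.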