Date: Sat, 04 Nov 2006 08:24:55 -0800
From: Sam Leffler <sam@errno.com>
To: pyunyh@gmail.com
Cc: hackers@freebsd.org, "Devon H. O'Dell" <devon.odell@gmail.com>
Subject: Re: vr(4) performance
Message-ID: <454CBED7.2000103@errno.com>
In-Reply-To: <20061103003311.GD69214@cdnetworks.co.kr>
References: <9ab217670611021511l3120d58bhd0b61bf44f8ecc87@mail.gmail.com> <454A7EF2.5090201@errno.com> <20061103003311.GD69214@cdnetworks.co.kr>

Pyun YongHyeon wrote:
> On Thu, Nov 02, 2006 at 03:27:46PM -0800, Sam Leffler wrote:
> > Devon H. O'Dell wrote:
> > > Hey all,
> > >
> > > So, vr(4) kind of sucks, and it seems like this is mostly due to the
> > > fact that we call m_defrag() on every mbuf that we send through it.
> > > This seems to really screw performance on outgoing packets (something
> > > like 33% the output efficiency of fxp(4), if I'm understanding this
> > > all correctly).
> > >
> > > I'm sort of wondering if anybody has attempted to address this before
> > > and if there's a way to possibly mitigate this behavior. I know Bill
> > > Paul's comments say ``Unfortunately, FreeBSD doesn't guarantee
> > > that mbufs will be filled in starting at longword boundaries, so we
> > > have to do a buffer copy before transmission.'' -- since it's been a
> > > long day, and I'm about to go home to grab a pizza and stop thinking
> > > about code, would anybody mind offering suggestions as to either:
> > >
> > > a) Pros and cons of guaranteeing that they're filled in aligned (and
> > > possibly hints on doing it), or
> > > b) Possible workarounds / hacks to do this faster for vr(4)
> > >
> > > Any input is appreciated! (Except ``vr(4) is lol'')
> >
> > m_defrag is ~10x slower than it needs to be. I proposed changes to
> > address this a while back but eventually gave up and put driver-specific
> > code in ath. You can look there or I can send you some patches to
> > m_defrag to try out in vr.
> >
>
> Because the purpose of m_defrag(9) in vr(4) is to guarantee longword
> aligned mbufs I'm not sure ath_defrag can be used here. If memory
> serves me right ath_defrag would not change the first mbuf address
> in a chain. If the first mbuf is not aligned on longword boundary
> it wouldn't work I guess. Of course we can check the first mbuf in
> the chain before calling super-fast ath_defrag, I guess.
>

m_defrag is used for two main purposes in the system: reducing the mbuf
count in a chain so that an outbound packet fits in a limited number of
h/w tx descriptors, and aligning packet data for cards with constrained
dma engines. Both of these operations belong in bus_dma. Combining them
in a single routine results in overly pessimistic code for the common
case. Separately, the algorithm in m_defrag is suboptimal (e.g. it makes
a complete copy even when a packet needs no changes).

ath_defrag is example code tailored to the ath driver that handles only
the case where the mbuf chain is too long. I have other code that can do
packet alignment, or alignment plus mbuf coalescing, far better than the
current logic in m_defrag.

The right solution to this problem, as suggested by John Baldwin and
Scott Long, is to improve the bus_dma code so these things happen
automatically for the driver according to the dma tag config. This would
eliminate the need for m_defrag in all cases I'm aware of. Since bus_dma
has info like the max # of segments a device can accept and any alignment
constraints, it can do a much more efficient job.

	Sam
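
The "copy only when the packet actually needs it" idea above can be sketched in userland C. This is not the ath_defrag or m_defrag code, and it is not a bus_dma interface: struct seg is a toy stand-in for an mbuf, and struct dma_limits, chain_fits() and prepare_tx() are hypothetical names invented for illustration. The sketch also assumes the required alignment is no stricter than what malloc() guarantees, which covers the longword case discussed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an mbuf: one contiguous data segment per node. */
struct seg {
	struct seg *next;
	unsigned char *data;
	size_t len;
};

/* Hypothetical per-device limits, the kind of info a dma tag could carry. */
struct dma_limits {
	size_t max_segs;	/* tx descriptors available per packet */
	size_t alignment;	/* required segment start alignment, power of 2 */
};

/* Return 1 if the chain already satisfies both limits, 0 otherwise. */
static int
chain_fits(const struct seg *s, const struct dma_limits *lim)
{
	size_t n = 0;

	for (; s != NULL; s = s->next) {
		if (((uintptr_t)s->data & (lim->alignment - 1)) != 0)
			return (0);	/* segment starts misaligned */
		if (++n > lim->max_segs)
			return (0);	/* too many segments */
	}
	return (1);
}

/*
 * Return 0 if the chain can be sent untouched (no copy is made),
 * 1 if it was coalesced into *copy (caller frees it), -1 on failure.
 */
static int
prepare_tx(const struct seg *chain, const struct dma_limits *lim,
    unsigned char **copy, size_t *copy_len)
{
	const struct seg *p;
	unsigned char *buf;
	size_t total = 0, off = 0;

	assert(lim->alignment != 0 &&
	    (lim->alignment & (lim->alignment - 1)) == 0);
	*copy = NULL;
	*copy_len = 0;

	if (chain_fits(chain, lim))
		return (0);		/* common case: nothing to do */

	for (p = chain; p != NULL; p = p->next)
		total += p->len;
	buf = malloc(total != 0 ? total : 1);
	if (buf == NULL)
		return (-1);
	/* Assumption: malloc alignment is at least the device alignment. */
	if (((uintptr_t)buf & (lim->alignment - 1)) != 0) {
		free(buf);
		return (-1);
	}
	for (p = chain; p != NULL; p = p->next) {
		memcpy(buf + off, p->data, p->len);
		off += p->len;
	}
	*copy = buf;
	*copy_len = total;
	return (1);
}
```

A driver-style caller would hand the original chain to the hardware when prepare_tx() returns 0 and only pay for the copy when it returns 1. Keeping the limits in one structure mirrors the point that a dma tag already knows the segment count and alignment constraints.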