Date: Mon, 13 Sep 2010 21:11:25 +0200
From: Andre Oppermann <andre@freebsd.org>
To: pyunyh@gmail.com
Cc: Tom Judge <tom@tomjudge.com>, freebsd-net@freebsd.org, davidch@broadcom.com, yongari@freebsd.org
Subject: Re: bce(4) - com_no_buffers (Again)
Message-ID: <4C8E775D.8070202@freebsd.org>
In-Reply-To: <20100913184833.GF1229@michelle.cdnetworks.com>
References: <4C894A76.5040200@tomjudge.com> <20100910002439.GO7203@michelle.cdnetworks.com> <4C8E3D79.6090102@tomjudge.com> <20100913184833.GF1229@michelle.cdnetworks.com>

On 13.09.2010 20:48, Pyun YongHyeon wrote:
> On Mon, Sep 13, 2010 at 10:04:25AM -0500, Tom Judge wrote:
>> Without BCE_JUMBO_HDRSPLIT then we see no errors. With it we see a number
>> of errors, however the rate seems to be reduced compared to the
>> previous version of the driver.
>>
>
> It seems there are issues in header splitting and it was disabled
> by default. Header splitting reduces packet processing overhead in
> upper layer so it's normal to see better performance with header
> splitting.

I'm not sure that header splitting really helps much, at least for TCP.
The only place where it could make a difference is at socket buffer
append time. There the header gets 'thrown away'. With header splitting
the first mbuf in the chain, containing the header, can be returned to
the free pool. Without header splitting it's just an offset change in
the mbuf.

IIRC header splitting was introduced with the Tigon cards, which were
the first programmable network cards and the first to support putting
the header in a different mbuf.

Header splitting, in theory, could make a difference with zero copy
sockets, where the data portion in a separate mbuf is flipped by VM magic
into userspace. The trouble is that no driver fully supports the
semantics required for page flipping, and the zero copy code, if compiled
in, is much less optimized for the non-flipping case than the standard
code path. With the many dozens of gigabytes per second of memory copy
bandwidth of current CPUs, it remains questionable whether the
page-flipping VM magic is actually faster than a plain kernel/userspace
copy as in the standard code path. I generally recommend not using
ZERO_COPY_SOCKETS.

I suspect that in the case of the bce(4) driver the change in header
splitting is probably not the cause of the performance difference.

-- 
Andre
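
To illustrate the point about socket buffer append time, here is a rough userland sketch of the two cases. It is a toy model only: struct toy_mbuf, toy_alloc() and toy_strip_header() are hypothetical names made up for this example and are not the kernel's struct mbuf or any mbuf(9) routine. It assumes the header occupies the first bytes of the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a kernel mbuf; hypothetical, NOT the real struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;  /* next buffer in the chain */
    char *data;             /* start of the valid bytes */
    size_t len;             /* number of valid bytes */
    char storage[256];      /* backing store for this toy buffer */
};

/* Allocate a toy buffer holding len bytes and link it in front of next. */
static struct toy_mbuf *
toy_alloc(size_t len, struct toy_mbuf *next)
{
    struct toy_mbuf *m = calloc(1, sizeof(*m));

    if (m == NULL)
        abort();
    assert(len <= sizeof(m->storage));
    m->data = m->storage;
    m->len = len;
    m->next = next;
    return (m);
}

/*
 * Drop hdrlen leading bytes from the chain.
 * - Split case: the header fills the whole first buffer, so that buffer
 *   is freed (in the kernel it would go back to the free pool).
 * - Unsplit case: header and payload share a buffer, so only the data
 *   pointer and length are adjusted.
 * *freed reports how many buffers were released.
 */
static struct toy_mbuf *
toy_strip_header(struct toy_mbuf *m, size_t hdrlen, int *freed)
{
    *freed = 0;
    while (m != NULL && hdrlen > 0 && hdrlen >= m->len) {
        struct toy_mbuf *n = m->next;

        hdrlen -= m->len;
        free(m);
        (*freed)++;
        m = n;
    }
    if (m != NULL && hdrlen > 0) {
        m->data += hdrlen;   /* just an offset change */
        m->len -= hdrlen;
    }
    return (m);
}
```

For a 40-byte header in front of 100 bytes of payload, the split chain (40 + 100) releases one buffer, while the single 140-byte buffer only has its data pointer moved forward by 40. Either way the payload is left untouched, which is why the difference is small.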
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4C8E775D.8070202>