Date: Wed, 2 Aug 2006 20:01:16 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: John Polstra <jdp@polstra.com>
Cc: arch@FreeBSD.org, Robert Watson <rwatson@FreeBSD.org>, net@FreeBSD.org
Subject: RE: Changes in the network interface queueing handoff model
Message-ID: <20060802184349.K90387@delplex.bde.org>
In-Reply-To: <XFMail.20060731100533.jdp@polstra.com>
References: <XFMail.20060731100533.jdp@polstra.com>

On Mon, 31 Jul 2006, John Polstra wrote:

> I question whether you need a fallback software if_snd queue at all
> for modern devices such as the Intel and Broadcom gigabit chips.  The
> hardware transmit descriptor rings typically have sizes of the order
> of 256 descriptors.  I think if the ring fills up, you could simply
> drop the packet with ENOBUFS.  That's what happens if the if_snd queue
> fills up, and its maximum size is comparable to the sizes of modern
> descriptor rings.  It would simplify things quite a bit to eliminate
> the if_snd queue entirely for such devices.

I use an if_snd queue length of about 5000 in my version of the sk
driver to work around suckage in ENOBUFS handling.  The hardware (*)
tx ring size is 512, and tiny packets can be sent in 4 usec, so the
hardware queue provides only 2 msec worth of buffering.  select(2)
for output on sockets doesn't work right, so there is no good way (**)
for applications to proceed when a syscall returns ENOBUFS.  An extra
queue length of 5000 provides an extra 20 msec worth of buffering,
which is usually enough when HZ = 100.

(*) I think the sk tx ring is not really in hardware, so it can be much
larger than 512, but a length of > 5000 for it seems excessive and
caused panics when I tried it.

(**) Various bad ways can be found in various versions of ttcp and
tools/netrate.  They involve either backing off by sleeping (which
doesn't keep the tx active unless the sleep granularity is small
(which only happens under FreeBSD if HZ is too large)), or never
backing off (which gives busy-waiting).  Instead, select() on the
output socket should actually work -- it should succeed if the tx
queue length is below a low watermark.  Apparently, select() on output
sockets normally doesn't work, since no version of ttcp that I've
looked at (not many) even tries this.

Bruce
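
The buffering figures in the message can be checked with a short C
sketch.  It multiplies a queue length by the per-packet transmit time
and compares the result with one clock tick (1/HZ), which is roughly the
shortest time a sender that backs off by sleeping will be away.  The
function names are hypothetical and are not taken from the sk driver or
any FreeBSD API; the inputs (512 descriptors, about 5000 queue entries,
4 usec per tiny packet, HZ = 100) are the numbers quoted in the message.

```c
#include <assert.h>

/*
 * Time, in microseconds, for which a full transmit queue keeps the
 * transmitter busy.  Hypothetical helper for illustration only.
 */
unsigned long
queue_drain_usec(unsigned long queue_len, unsigned long usec_per_packet)
{
	assert(usec_per_packet > 0);
	return (queue_len * usec_per_packet);
}

/* Length of one clock tick in microseconds for a given HZ. */
unsigned long
tick_usec(unsigned long hz)
{
	assert(hz > 0);
	return (1000000UL / hz);
}

/*
 * Nonzero if a full queue takes at least one tick to drain, i.e. the
 * transmitter is still busy when a sender that slept for one tick
 * after ENOBUFS wakes up again.
 */
int
survives_one_tick(unsigned long queue_len, unsigned long usec_per_packet,
    unsigned long hz)
{
	return (queue_drain_usec(queue_len, usec_per_packet) >=
	    tick_usec(hz));
}
```

With these inputs, queue_drain_usec(512, 4) gives 2048 usec, well short
of the 10000 usec tick at HZ = 100, so the ring alone runs dry while the
sender sleeps.  queue_drain_usec(5000, 4) gives 20000 usec, which covers
two ticks.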