Date: Thu, 08 Nov 2012 12:13:14 +0100
From: Andre Oppermann <andre@freebsd.org>
To: pyunyh@gmail.com
Cc: FreeBSD Net <freebsd-net@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, Pyun YongHyeon <yongari@freebsd.org>
Subject: Re: svn commit: r242739 - stable/9/sys/dev/ti
Message-ID: <509B93CA.90609@freebsd.org>
In-Reply-To: <20121108023858.GA3127@michelle.cdnetworks.com>
References: <201211080206.qA826RiN054539@svn.freebsd.org> <CAJ-VmomEOPGbLwmOmL0EdenZA7QKbV5P-hAYsTRcwLao2LbAqg@mail.gmail.com> <20121108023858.GA3127@michelle.cdnetworks.com>

On 08.11.2012 03:38, YongHyeon PYUN wrote:
> On Wed, Nov 07, 2012 at 06:15:30PM -0800, Adrian Chadd wrote:
>> If so, may I suggest we perhaps accelerate discussing if_transmit() of
>> multiple frames per call?
>
> Hmm, actually I'm still not a fan of if_transmit() at this moment.
> Honestly I don't have good queuing code in driver to handle queue
> full condition. Interactions with altq(9) is also one of my
> concern as well as packet reordering issue of drbr(9) interface.

The whole interface packet handoff needs some serious reconsideration.

These days we have two queues/buffers at the interfaces. One is the
DMA ring, which can take a considerable number of packets, and the
second one is the ifq (if enabled). The DMA ring already adds
significant depth and latency, so that a packet scheduler like altq(9)
becomes almost useless. Modern queue management algorithms like CoDel
don't work with the current framework either. Bufferbloat is also a
major concern. See the ACM Queue article by Jim Gettys.

What we need to make this functionality available again is a well
specified and reasonably simple interface handoff. It should include
information on the maximum tx DMA ring depth and the current depth.
There should also be a function to limit the current depth to a
certain value.

What I'd like to see is this (names are not fixed):

 if_send() as the main entry point for the stack. It's a function
 pointer within struct ifnet. In normal operation it is the same as
 if_transmit() and directly adds a packet to the tx DMA ring. Locking
 of the DMA ring is done in this function and is a property of the
 driver. The stack always calls it unlocked. Obviously the tx DMA ring
 lock must not be a sleep lock. When altq(9) or an equivalent is
 active, this function pointer is replaced with a call to the
 alternative queuing function that does its magic. Again, locking of
 the queuing mechanism is the property of that mechanism. When a NIC
 has multiple queues that it can bind to CPUs, locking may not be
 necessary. We gain the flexibility in the driver to do that.

 if_transmit() is a function pointer for a function that directly adds
 a packet to the tx DMA ring (if a free slot is available). It is
 never called by the stack directly except in special circumstances.
 altq(9), if active, uses if_transmit() to add packets to the tx DMA
 ring. If altq(9) is not active, if_send() points to this same
 function.

 if_txeof() is a function pointer for a callback from the driver to an
 altq(9) dequeue function, if active. It is called when new free slots
 on the tx DMA ring become available.

When a driver needs a software interface queue because the tx DMA ring
is too small, the stack should provide a generic queuing
implementation the driver can use.

I've begun to explore this area while hacking on bge(4) and em(4) in
the tcp_workqueue branch. It's a very interesting path and we're going
to have a couple more discussions before we arrive at the optimal
solution. :)

-- 
Andre
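
For illustration only, the handoff described above might look roughly like the following userland C sketch. It is not kernel code and not an existing FreeBSD API. Only the names if_send, if_transmit and if_txeof come from the message. Everything else (struct ifnet_sketch, the drv_* and q_* functions, ring_limit, the array-based queue and the sent[] log) is a hypothetical stand-in, and locking is only indicated in comments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TX_RING_MAX 4    /* hypothetical hardware tx ring size */
#define SWQ_MAX     16   /* hypothetical software queue size */

struct pkt { int id; };
struct ifnet_sketch;
typedef int (*if_xmit_fn)(struct ifnet_sketch *, struct pkt *);

/* Hypothetical stand-in for the relevant parts of struct ifnet. */
struct ifnet_sketch {
    if_xmit_fn if_send;       /* entry point used by the stack */
    if_xmit_fn if_transmit;   /* adds a packet directly to the tx DMA ring */
    void (*if_txeof)(struct ifnet_sketch *); /* ring slots were freed */
    int ring_max;             /* maximum tx DMA ring depth */
    int ring_limit;           /* current depth limit, <= ring_max */
    int ring_depth;           /* packets currently on the ring */
    struct pkt *swq[SWQ_MAX]; /* queue owned by the queuing mechanism */
    int swq_head, swq_len;
    int sent[64], nsent;      /* log of packet ids in ring order */
};

/* Driver: put one packet on the ring if a slot is free. */
static int drv_transmit(struct ifnet_sketch *ifp, struct pkt *p)
{
    /* A real driver would take its non-sleepable tx ring lock here. */
    if (ifp->ring_depth >= ifp->ring_limit)
        return ENOBUFS;
    ifp->ring_depth++;
    if (ifp->nsent < 64)
        ifp->sent[ifp->nsent++] = p->id;
    return 0;
}

/* Driver: limit the usable ring depth to a given value. */
static void drv_set_ring_limit(struct ifnet_sketch *ifp, int limit)
{
    if (limit < 1)
        limit = 1;
    if (limit > ifp->ring_max)
        limit = ifp->ring_max;
    ifp->ring_limit = limit;
}

/* Queuing mechanism: move queued packets to the ring while slots are free. */
static void q_txeof(struct ifnet_sketch *ifp)
{
    while (ifp->swq_len > 0 &&
        ifp->if_transmit(ifp, ifp->swq[ifp->swq_head]) == 0) {
        ifp->swq_head = (ifp->swq_head + 1) % SWQ_MAX;
        ifp->swq_len--;
    }
}

/* Queuing mechanism: replacement for if_send while it is active. */
static int q_send(struct ifnet_sketch *ifp, struct pkt *p)
{
    if (ifp->swq_len == SWQ_MAX)
        return ENOBUFS;
    ifp->swq[(ifp->swq_head + ifp->swq_len) % SWQ_MAX] = p;
    ifp->swq_len++;
    q_txeof(ifp);             /* FIFO here; a scheduler would pick smarter */
    return 0;
}

/* Driver: n packets left the ring; notify the queuing mechanism. */
static void drv_tx_complete(struct ifnet_sketch *ifp, int n)
{
    assert(n >= 0 && n <= ifp->ring_depth);
    ifp->ring_depth -= n;
    if (ifp->if_txeof != NULL)
        ifp->if_txeof(ifp);
}

/* Normal operation: if_send is the same function as if_transmit. */
static void drv_attach(struct ifnet_sketch *ifp)
{
    memset(ifp, 0, sizeof(*ifp));
    ifp->ring_max = ifp->ring_limit = TX_RING_MAX;
    ifp->if_transmit = drv_transmit;
    ifp->if_send = drv_transmit;
    ifp->if_txeof = NULL;
}

/* Activate the queuing mechanism by swapping the function pointers. */
static void queue_attach(struct ifnet_sketch *ifp)
{
    ifp->if_send = q_send;
    ifp->if_txeof = q_txeof;
}
```

In this sketch the stack only ever calls ifp->if_send(). Whether the packet goes straight to the ring or through a queue is decided by which function the pointer currently holds, and the ring depth limit keeps most of the backlog in the queue where a scheduler can still act on it.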