Date: Tue, 1 Aug 2006 09:47:53 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: Robert Watson <rwatson@FreeBSD.org>
Cc: arch@FreeBSD.org, net@FreeBSD.org
Subject: Re: Changes in the network interface queueing handoff model
Message-ID: <17615.23433.918293.466584@grasshopper.cs.duke.edu>
In-Reply-To: <20060801142558.M64452@fledge.watson.org>
References: <20060730141642.D16341@fledge.watson.org> <17615.18793.700752.342809@grasshopper.cs.duke.edu> <20060801142558.M64452@fledge.watson.org>

Robert Watson writes:
 >
 > On Tue, 1 Aug 2006, Andrew Gallatin wrote:
 >
 > > > - The ifnet send queue is a separately locked object from the device driver,
 > > >   meaning that for a single enqueue/dequeue pair, we pay an extra four lock
 > > >   operations (two for insert, two for remove) per packet.
 > >
 > > Going forward, especially now that we support sun4v CoolThreads hardware,
 > > we're going to want to rethink the "single lock" per transmit routine model
 > > that most drivers have.  The most expensive operation in transmit routines
 > > is bus_dmamap_load_mbuf_sg(), especially when there is an IOMMU involved
 > > (like on CoolThreads machines) and there is no reason why this needs to be
 > > called with a driver's transmit lock held.  I have hard data (from Solaris)
 > > about how much fine grained locking in a 10GbE driver's transmit routine
 > > helps.
 >
 > Right now, with the exception of locking for the ifnet dispatch queue, I
 > believe our ifnet API pretty much leaves decisions about the nature and
 > granularity of synchronization to the device driver author.  The ifnet queue
 > is high on my list to address (hence this thread) -- are there any other parts
 > of our device driver framework that stand in the way from a device driver
 > being modified to support greater parallelism in sending?

No, not that is directly related to ethernet drivers.

However, busdma is a pain.  Specifically, I hate that
bus_dmamap_load_mbuf_sg() requires a bus_dmamap_t.  That means that
any fine-grained driver will need to "allocate" a bus_dmamap_t either
via bus_dmamap_create(), or by pulling a pre-allocated bus_dmamap_t
from a pre-allocated pool.  Either will require a lock.  Solaris has a
similar problem, and I use the pool approach in my Solaris driver.
Linux's pci_map_single()/pci_unmap_addr_set()/pci_unmap_len_set() is
just so much nicer to use...

Drew
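
Editorial sketch of the "pre-allocated pool" approach mentioned above. This is not code from any FreeBSD or Solaris driver. It is a minimal userland illustration, assuming a fixed pool size and using a POSIX mutex in place of a kernel mutex. The names map_handle, map_pool, pool_init, pool_get and pool_put are hypothetical, and struct map_handle merely stands in for a bus_dmamap_t. It shows why "either will require a lock": the free list is shared, so taking and returning a handle must be serialized, but the lock covers only a pointer swap, and the expensive mapping call could then run without it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 4                 /* hypothetical: max in-flight transmits */

/* Stand-in for a bus_dmamap_t; a real driver would keep the handle
 * obtained from bus_dmamap_create() here. */
struct map_handle {
    int id;
    struct map_handle *next;        /* free-list link */
};

struct map_pool {
    pthread_mutex_t lock;           /* protects free_list only */
    struct map_handle *free_list;
    struct map_handle slots[POOL_SIZE];
};

/* Create every handle up front and chain them onto the free list. */
static void pool_init(struct map_pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->free_list = NULL;
    for (int i = 0; i < POOL_SIZE; i++) {
        p->slots[i].id = i;
        p->slots[i].next = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/* Take one handle; returns NULL when the pool is exhausted.
 * The lock is held only while the list head is swapped. */
static struct map_handle *pool_get(struct map_pool *p)
{
    pthread_mutex_lock(&p->lock);
    struct map_handle *h = p->free_list;
    if (h != NULL)
        p->free_list = h->next;
    pthread_mutex_unlock(&p->lock);
    return h;
}

/* Return a handle once the transmit that used it has completed. */
static void pool_put(struct map_pool *p, struct map_handle *h)
{
    pthread_mutex_lock(&p->lock);
    h->next = p->free_list;
    p->free_list = h;
    pthread_mutex_unlock(&p->lock);
}
```

In a transmit path shaped like this, the caller would do pool_get(), perform the DMA load with no driver lock held, and call pool_put() from the completion path. The pool lock is still one extra lock/unlock pair per packet on each side, which is the cost the message is complaining about.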