Date: Mon, 7 Jul 2014 17:42:08 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Hans Petter Selasky <hps@selasky.org>
Cc: freebsd-net@freebsd.org, freebsd-current@FreeBSD.org
Subject: Re: [RFC] Allow m_dup() to use JUMBO clusters
Message-ID: <1613382893.8316833.1404769328724.JavaMail.root@uoguelph.ca>
In-Reply-To: <53BA5657.8010309@selasky.org>

Hans Petter Selasky wrote:
> Hi,
>
> I'm asking for some input on the attached m_dup() patch, so that
> existing functionality or dependencies are not broken. The background
> for the change is to allow m_dup() to defrag long mbuf chains that
> doesn't fit into a specific hardware's scatter gather entries,
> typically when doing TSO.
>
> In my case the HW limit is 16 entries of length 4K for doing a
> 64KByte TSO packet. Currently m_dup() is at best producing 32 entries
> of each 2K for a 64Kbytes TSO packet.
>
> By allowing m_dup() to get JUMBO clusters when allocating mbufs, we
> avoid creating a new function, specific to the hardware, to defrag
> some rare-occurring very long mbuf chains into a mbuf chain below 16
> entries.
>
> Any comments?
>

1 - If you are using NFS with the default (64K) I/O size, then long
    mbuf chains of 35 entries aren't rare. They happen on every read
    reply/write request.
2 - When I changed NFS to use pagesize clusters for reads/writes I was
    able to get the system into a state where threads were persistently
    stuck on "btalloc". If I understand this correctly, the system was
    not able to allocate boundary tags because the kernel address space
    had been fragmented too much.
    --> As such, I never committed this patch to head and would caution
        against using pagesize clusters.

I do not have a better solution at this point, but I do have an untested
patch (I need to get access to some TSO enabled hardware to test it) that
adds if_hw_tsomaxseg, which is a count of the maximum number of transmit
segments (mbufs in chain) that a network device driver supports.

I think that having the driver set if_hw_tsomaxseg == 16 is preferable
to doing a copy of the data to pagesize clusters. (I'd also say that
hardware that supports only 16 transmit segments for a TSO segment is
not a good piece of hardware for FreeBSD.)

rick

> --HPS