Date: Mon, 27 Jan 2014 18:47:10 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: John-Mark Gurney <jmg@funkthat.com>
Cc: freebsd-net@freebsd.org, Adam McDougall <mcdouga9@egr.msu.edu>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID: <222089865.17245782.1390866430479.JavaMail.root@uoguelph.ca>
In-Reply-To: <20140127032338.GP13704@funkthat.com>
John-Mark Gurney wrote:
> Rick Macklem wrote this message on Sun, Jan 26, 2014 at 21:16 -0500:
> > Btw, thanks go to Garrett Wollman for suggesting the change to
> > MJUMPAGESIZE clusters.
> >
> > rick
> > ps: If the attachment doesn't make it through and you want the
> > patch, just email me and I'll send you a copy.
>
> The patch looks good, but we probably shouldn't change _readlink..
> The chances of a link being >2k are pretty slim, and the chances of
> the link being >32k are even smaller...
>
Yea, I already thought of that, actually. However, see below w.r.t.
NFSv4.

However, at this point I mostly want to find out if it is the long mbuf
chain that causes problems for TSO-enabled network interfaces.

> In fact, we might want to switch _readlink to MGET (could be
> conditional upon cnt) so that if it fits in an mbuf we don't allocate
> a cluster for it...
>
For NFSv4, what was an RPC for NFSv3 becomes one of several Ops. in a
compound RPC. As such, there is no way to know how much additional RPC
message there will be. So, although the readlink reply won't use much of
the 4K allocation, replies for subsequent Ops. in the compound certainly
could.
(Is it more efficient to allocate 4K now and use part of it for
 subsequent message reply stuff, or to allocate additional mbuf clusters
 later for subsequent stuff, as required? On a small memory-constrained
 machine, I suspect the latter is correct, but for the kind of hardware
 that has TSO scatter/gather enabled network interfaces, I'm not so
 sure. At this point, I wouldn't even say that using 4K clusters is
 going to be a win, and my hunch is that any win wouldn't apply to small
 memory-constrained machines.)

My test server has 256Mbytes of RAM and it certainly doesn't show any
improvement (big surprise;-), but it also doesn't show any degradation
in the limited testing I've done.

Again, my main interest at this point is whether reducing the number of
mbufs in the chain fixes the TSO issues.
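To make the chain-length point concrete, here is a rough sketch in plain userland C. It is not code from the patch. It only counts how many clusters (one mbuf each) a reply of a given size needs with 2K versus 4K clusters. The cluster sizes are the usual values of MCLBYTES and MJUMPAGESIZE on a 4K-page machine, which is an assumption here. The helper names and the idea of a fixed per-packet segment limit are hypothetical and used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed cluster sizes: the usual values of MCLBYTES and of
 * MJUMPAGESIZE when PAGE_SIZE is 4K. Check the target's headers.
 */
enum {
	CL_2K = 2048,
	CL_4K = 4096
};

/* Number of clusters (one mbuf each) needed to hold len bytes. */
static size_t
chain_len(size_t len, size_t cluster_size)
{
	assert(cluster_size > 0);
	return ((len + cluster_size - 1) / cluster_size);
}

/*
 * Hypothetical check: does a chain for len bytes stay within a
 * driver's per-packet scatter/gather segment limit (max_segs)?
 */
static int
fits_segment_limit(size_t len, size_t cluster_size, size_t max_segs)
{
	return (chain_len(len, cluster_size) <= max_segs);
}
```

Under these assumptions, a 64K read reply plus 200 bytes of headers (a made-up figure) needs 33 clusters of 2K but only 17 clusters of 4K, so a hypothetical limit of 32 segments would be exceeded in the first case and not in the second.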
I think the question of whether or not 4K clusters are a performance
improvement in general is an interesting one that comes later.

rick

> --
>   John-Mark Gurney				Voice: +1 415 225 5579
>
>      "All that I will do, has been done, All that I have, has not."
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to
> "freebsd-net-unsubscribe@freebsd.org"
>