Date: Wed, 2 Nov 2005 16:46:54 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
Cc: freebsd-current@freebsd.org, ticso@cicely.de
Subject: Re: Poor NFS server performance in 6.0 with SMP and mpsafenet=1
Message-ID: <20051102164157.A18382@fledge.watson.org>
In-Reply-To: <1130945849.51544.42.camel@buffy.york.ac.uk>
References: <1130943516.51544.34.camel@buffy.york.ac.uk> <20051102152322.GF93549@cicely12.cicely.de> <1130945849.51544.42.camel@buffy.york.ac.uk>
On Wed, 2 Nov 2005, Gavin Atkinson wrote:

> On Wed, 2005-11-02 at 16:23 +0100, Bernd Walter wrote:
>> On Wed, Nov 02, 2005 at 02:58:36PM +0000, Gavin Atkinson wrote:
>>> I'm seeing incredibly poor performance when serving files from an SMP
>>> FreeBSD 6.0-RC1 server to a Solaris 10 client. I've done some
>>> experimenting and have discovered that either removing SMP from the
>>> kernel or setting debug.mpsafenet=0 in loader.conf massively improves
>>> the speed. Switching preemption off also seems to help.
>>>
>>> No SMP, mpsafenet=1                   59.4
>>> No SMP, mpsafenet=0                   49.4
>>> No SMP, mpsafenet=1, no PREEMPTION    53.1
>>> No SMP, mpsafenet=0, no PREEMPTION    51.9
>>> SMP, mpsafenet=1                     351.7
>>> SMP, mpsafenet=0                      74.5
>>> SMP, mpsafenet=1, no PREEMPTION      264.9
>>> SMP, mpsafenet=0, no PREEMPTION       53.7
>>
>> Which scheduler?
>
> BSD. As I say, I'm running 6.0-RC1 with the standard GENERIC kernel,
> apart from the options I have listed as changed above. Polling is
> therefore also not enabled.
>
> When I get home, I'll have a play with both ULE and POLLING to see what
> difference they make; however, ideally I'd like not to use polling in
> production if possible.

This does sound like a scheduling problem. I realize it's time-consuming, but would it be possible for you to run each of the above test cases twice more (or maybe even once more) to confirm that in each case the result is reproducible? I've recently been looking at a scheduling problem relating to PREEMPTION and the netisr for loopback traffic, which is basically a result of poorly timed context switching producing a worst-case scenario. I suspect something similar is at work here.

Have you tried varying the number of nfsd worker threads on the server to see how that changes matters?

Robert N M Watson
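[Editor's note: for readers wanting to reproduce the configurations compared above on a FreeBSD 6.x system, the knobs mentioned in the thread are set roughly as follows. This is a sketch; the nfsd thread count of 8 is an illustrative value for the "vary the number of nfsd worker threads" experiment, not a figure from this thread.]

```
# /boot/loader.conf -- run the network stack under the Giant lock
# (the mpsafenet=0 cases in the table above)
debug.mpsafenet="0"

# /etc/rc.conf -- enable the NFS server; -n sets the number of nfsd
# worker threads (8 here is only an example value to experiment with)
nfs_server_enable="YES"
nfs_server_flags="-u -t -n 8"

# Kernel configuration -- the SMP / PREEMPTION cases correspond to
# building GENERIC with or without these two lines:
#   options  SMP
#   options  PREEMPTION

# At runtime, the effective setting can be checked with:
#   sysctl debug.mpsafenet
```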