Date: Wed, 2 Nov 2005 16:46:54 +0000 (GMT)
From: Robert Watson
To: Gavin Atkinson
Cc: freebsd-current@freebsd.org, ticso@cicely.de
Subject: Re: Poor NFS server performance in 6.0 with SMP and mpsafenet=1
Message-ID: <20051102164157.A18382@fledge.watson.org>
In-Reply-To: <1130945849.51544.42.camel@buffy.york.ac.uk>

On Wed, 2 Nov 2005, Gavin Atkinson wrote:

> On Wed, 2005-11-02 at 16:23 +0100, Bernd Walter wrote:
>> On Wed, Nov 02, 2005 at 02:58:36PM +0000, Gavin Atkinson wrote:
>>> I'm seeing incredibly poor performance when serving files from an SMP
>>> FreeBSD 6.0RC1 server to a Solaris 10 client.  I've done some
>>> experimenting and have discovered that either removing SMP from the
>>> kernel, or setting debug.mpsafenet=0 in loader.conf massively improves
>>> the speed.  Switching preemption off seems to also help.
>>>
>>> No SMP, mpsafenet=1                     59.4
>>> No SMP, mpsafenet=0                     49.4
>>> No SMP, mpsafenet=1, no PREEMPTION      53.1
>>> No SMP, mpsafenet=0, no PREEMPTION      51.9
>>> SMP, mpsafenet=1                       351.7
>>> SMP, mpsafenet=0                        74.5
>>> SMP, mpsafenet=1, no PREEMPTION        264.9
>>> SMP, mpsafenet=0, no PREEMPTION         53.7
>>
>> Which scheduler?
>
> BSD.  As I say, I'm running 6.0-RC1 with the standard GENERIC kernel,
> apart from the options I have listed as being changed above.  Polling is
> therefore also not enabled.
>
> When I get home, I'll have a play with both ULE and POLLING to see what
> difference they make, however ideally I'd like to not use polling in
> production if possible.

This does sound like a scheduling problem.  I realize it's time-consuming,
but would it be possible to have you run each of the above test cases twice
more (or maybe even once) to confirm that in each case, the result is
reproducible?  I've recently been looking at a scheduling problem relating
to PREEMPTION and the netisr for loopback traffic, which is basically the
result of poorly timed context switching producing a worst-case scenario.
I suspect something similar is likely here.

Have you tried varying the number of nfsd worker threads on the server to
see how that changes matters?

Robert N M Watson
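
For comparing the configurations in the quoted table, and for checking repeated runs once they exist, a minimal sketch follows. It assumes the figures are elapsed times, so lower is better; the thread does not state the unit. The helper names and the 10% tolerance are illustrative assumptions, not anything proposed in the thread.

```python
# Figures copied from the quoted table. The unit is not stated in the
# thread; this sketch ASSUMES they are elapsed times (lower is better).
timings = {
    "No SMP, mpsafenet=1": 59.4,
    "No SMP, mpsafenet=0": 49.4,
    "No SMP, mpsafenet=1, no PREEMPTION": 53.1,
    "No SMP, mpsafenet=0, no PREEMPTION": 51.9,
    "SMP, mpsafenet=1": 351.7,
    "SMP, mpsafenet=0": 74.5,
    "SMP, mpsafenet=1, no PREEMPTION": 264.9,
    "SMP, mpsafenet=0, no PREEMPTION": 53.7,
}


def slowdown_vs_fastest(results):
    """Return each configuration's figure divided by the smallest figure."""
    fastest = min(results.values())
    return {name: value / fastest for name, value in results.items()}


def is_reproducible(runs, tolerance=0.10):
    """Hypothetical check: True if (max - min) / mean is within tolerance.

    The 10% default is an arbitrary illustration, not a value from the thread.
    """
    mean = sum(runs) / len(runs)
    return (max(runs) - min(runs)) / mean <= tolerance


if __name__ == "__main__":
    ratios = slowdown_vs_fastest(timings)
    # Print configurations from fastest to slowest with their ratio.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<36} {ratio:5.2f}x")
```

Feeding the two or three repeated measurements of one configuration to `is_reproducible` gives a quick yes/no on whether that result is stable enough to compare against the others.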