Date: Mon, 27 Jul 2009 02:05:11 +0100 (BST)
From: Chris Hedley
To: Matthew Dillon
Cc: freebsd-current@freebsd.org
Subject: Re: Linux NFS ate my bge
In-Reply-To: <200907222340.n6MNe3K5013221@apollo.backplane.com>
References: <200907222307.n6MN7YhU012788@apollo.backplane.com> <200907222340.n6MNe3K5013221@apollo.backplane.com>
List-Id: Discussions about the use of FreeBSD-current

On Wed, 22 Jul 2009, Matthew Dillon wrote:

> TCP will likely work better, for several reasons, not the least of
> which being that the
> NFS client does not have to estimate a retransmit
> timeout on a rpc-by-rpc basis.  Such estimations fail utterly in the
> face of a large number of concurrent RPCs because latency winds up being
> governed by the disk backlog on the server.  A UDP mount will wind up
> retransmitting even under completely lossless conditions.
>
> Another reason TCP tends to work better is that UDP uses IP fragmentation
> and IP fragmentation reassembly is not typically in the critical path.
> The desired NFS filesystem block size is 16K (smaller will typically
> reduce performance), so even a 9000 MTU won't help.

It's interesting how this flies in the face of the assumptions I'd made: I'd just guessed that UDP would somehow be the better option; I think I'd had some vague idea that it might be more suited to fragmented file chunks being squirted over the network, with TCP being a compromise.

Well, my somewhat wonky assumptions aside, changing over to TCP seems to have fixed it: I haven't seen the problem rematerialise even under a much more protracted network load than before (essentially emerging [I use Gentoo for Linux] an update of pretty much everything).  Performance is better too, though it could still do with some serious improvement as it's a lot more sluggish than is ideal; I suspect the fault lies at the Linux end, though it may be my configuration options.

Of course, it took me two attempts to get TCP configured: I'd completely forgotten that I can't simply change it in fstab (I wasn't having a good day when it came to being insightful!) and had to change the entry in the pxelinux config to tell it to use TCP.  But I got there in the end, so thank you. :)

> Also use netstat ... not sure what option, I think -x, to determine the
> actual size of the socket buffer being employed for the connection
> (TCP or UDP).  There are multiple internal caps in the kernel and it
> is often not as big as you might have thought it should be.
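For anyone else following along, the change amounts to switching the mount options; a hypothetical sketch from the Linux client side (server name, export path and mount point are placeholders — and as noted above, on a PXE/NFS-root setup the equivalent option belongs in the pxelinux append line rather than fstab):

```shell
# Mount NFS over TCP with 16K read/write blocks (the size Matthew suggests).
# "fileserver:/export/gentoo" and "/mnt/nfs" are placeholder names.
mount -t nfs -o tcp,rsize=16384,wsize=16384 fileserver:/export/gentoo /mnt/nfs

# Or the equivalent /etc/fstab entry:
# fileserver:/export/gentoo  /mnt/nfs  nfs  tcp,rsize=16384,wsize=16384  0 0
```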
> You want
> a 256KB socket buffer at a minimum for a GigE network.  Smaller works
> (at least for linear transfers), but you lose a lot of RPC concurrency
> from the client.  Again, something that matters more for a linux client
> vs a FreeBSD client.

I think this will be my next port of call in the hope of getting the performance up to a better standard, but there's time for experimenting with that.  For now I'm just happy that my FreeBSD system no longer locks up when being bombarded with requests!

Cheers,

Chris.
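For reference when I (or anyone else) get to that experimenting: checking and raising the socket buffer caps might look something like this on the FreeBSD side.  The -x flag is the one Matthew tentatively mentioned, and the sysctl names are as I understand them — verify against your own version before relying on any of this:

```shell
# Extended socket information, including per-connection buffer details:
netstat -x

# Current kernel caps on socket buffer sizes:
sysctl kern.ipc.maxsockbuf
sysctl net.inet.tcp.sendspace net.inet.tcp.recvspace

# Raise the hard cap so a ~256KB per-socket buffer is actually attainable:
sysctl kern.ipc.maxsockbuf=524288
```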