From owner-freebsd-stable@FreeBSD.ORG Wed Jan 14 08:08:47 2004
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 47DE816A4CE
	for ; Wed, 14 Jan 2004 08:08:47 -0800 (PST)
Received: from mutare.noc.clara.net (mutare.noc.clara.net [195.8.70.95])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1663843DA4
	for ; Wed, 14 Jan 2004 08:07:50 -0800 (PST)
	(envelope-from ollie@mutare.noc.clara.net)
Received: from ollie by mutare.noc.clara.net with local (Exim 4.24)
	id 1AgnYa-0001ZD-PA; Wed, 14 Jan 2004 16:07:48 +0000
Date: Wed, 14 Jan 2004 16:07:48 +0000
From: Ollie Cook
To: Doug White
Message-ID: <20040114160748.GJ27744@mutare.noc.clara.net>
References: <20040113154932.GE354@mutare.noc.clara.net>
	<20040113114525.L63732@carver.gumbysoft.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20040113114525.L63732@carver.gumbysoft.com>
User-Agent: Mutt/1.4.1i
X-Operating-System: FreeBSD 4.9-STABLE i386
X-NCC-RegID: uk.claranet
Sender: Ollie Cook
cc: freebsd-stable@freebsd.org
Subject: Re: nfs send errors 32 and 35 on RELENG_4
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Wed, 14 Jan 2004 16:08:47 -0000

On Tue, Jan 13, 2004 at 11:50:41AM -0800, Doug White wrote:
> > Jan 13 14:02:02 mese /kernel: nfs server 192.168.1.1:/vol/vol1/claramail: not responding
> > Jan 13 14:02:03 mese /kernel: nfs server 192.168.1.1:/vol/vol1/claramail: is alive again
>
> There's some tuning options for this, which I don't immediately recall.
> Under heavy load these are somewhat normal.

Hi Doug,

I have set the kernel to auto-scale nmbclusters based on the memory in the
host in question.
I think it's not worth hard-coding these values, since the peak value does not
seem to get near the maximum:

  root@mese:[conf] (10) # netstat -m
  397/1200/34816 mbufs in use (current/peak/max):
          294 mbufs allocated to data
          103 mbufs allocated to packet headers
  215/676/8704 mbuf clusters in use (current/peak/max)
  1652 Kbytes allocated to network (6% of mb_map in use)
  0 requests for memory denied
  0 requests for memory delayed
  0 calls to protocol drain routines

If peak does get close to max, I will increase the number of nmbclusters, but
it doesn't look necessary at present.

All sysctls are at default values. Are there other things I can be looking at
tuning? These hosts each do approximately 500 NFS operations per second
(approx. 5 Mbit/s).

> > Jan 13 14:09:37 mese /kernel: nfs send error 35 for server 192.168.1.1:/vol/vol1/claramail
> > Jan 13 14:09:53 mese /kernel: nfs send error 32 for server 192.168.1.1:/vol/vol1/claramail
>
> These errors tend to imply resource shortages. Monitor netstat -m output
> and make sure you aren't running out of mbuf or mbuf clusters. Also check
> for network errors and dropped packets (netstat -s, switch statistics).

Prior to power cycling the box I was able to look at netstat -m and
netstat -i, neither of which showed anything to worry about. Next time it
happens I'll be sure to take a copy of the output, in case there's something
I'm missing.

Is there any way of finding out what error 35 actually means?

> Are you running rpc.lockd?

The server is a Network Appliance F-Series filer, which runs a locking manager:

  root@metis:[conf] (13) # rpcinfo -p 192.168.1.1 | grep lock
      100021    4   tcp    607  nlockmgr
      100021    3   tcp    607  nlockmgr
      100021    1   tcp    607  nlockmgr
      100021    4   udp    606  nlockmgr
      100021    3   udp    606  nlockmgr
      100021    1   udp    606  nlockmgr

Thank you for your help so far.

Cheers,

Ollie

-- 
Oliver Cook    Systems Administrator, Claranet UK
ollie@uk.clara.net               +44 20 7903 3065
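
The peak-versus-max check on the netstat -m figures quoted in this message
can be sketched as a small script. This is only a minimal sketch: it assumes
the FreeBSD 4.x "current/peak/max" line layout shown in the message, and the
names parse_usage and near_limit, as well as the 80% threshold, are
hypothetical choices for illustration rather than part of any FreeBSD tool.

```python
import re

# Matches lines such as "215/676/8704 mbuf clusters in use (current/peak/max)".
# Assumption: the layout is the one quoted in the message (FreeBSD 4.x).
USAGE_RE = re.compile(r"^\s*(\d+)/(\d+)/(\d+)\s+(mbufs|mbuf clusters) in use")


def parse_usage(netstat_m_output):
    """Return {resource: (current, peak, max)} parsed from netstat -m text."""
    usage = {}
    for line in netstat_m_output.splitlines():
        match = USAGE_RE.match(line)
        if match:
            current, peak, maximum = (int(n) for n in match.group(1, 2, 3))
            usage[match.group(4)] = (current, peak, maximum)
    return usage


def near_limit(usage, threshold=0.8):
    """List resources whose peak is at least threshold * max.

    The default threshold of 0.8 is an arbitrary illustrative value.
    """
    return [name for name, (_, peak, maximum) in usage.items()
            if maximum and peak / maximum >= threshold]


# Sample input copied from the netstat -m output in the message.
SAMPLE = """\
397/1200/34816 mbufs in use (current/peak/max):
        294 mbufs allocated to data
        103 mbufs allocated to packet headers
215/676/8704 mbuf clusters in use (current/peak/max)
1652 Kbytes allocated to network (6% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
"""

if __name__ == "__main__":
    usage = parse_usage(SAMPLE)
    for name, (current, peak, maximum) in usage.items():
        print(f"{name}: peak {peak} of {maximum} "
              f"({100 * peak / maximum:.1f}% of max)")
    print("near limit:", near_limit(usage) or "none")
```

Run against the sample above, neither mbufs nor mbuf clusters comes near the
threshold, which matches the reading in the message that raising nmbclusters
is not yet necessary. On a live host the text would come from running
netstat -m rather than from the embedded sample.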