From: Harald Schmalzbauer
Date: Fri, 25 Jul 2014 09:42:01 +0200
To: Rick Macklem
Cc: freebsd-stable
Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel
Message-ID: <53D20A49.5020803@omnilan.de>
In-Reply-To: <1327388853.3033655.1406247242764.JavaMail.root@uoguelph.ca>

Regarding Rick Macklem's message of 25.07.2014 02:14 (localtime):
> Harald Schmalzbauer wrote:
>> Regarding Rick Macklem's message of 08.08.2013 14:20 (localtime):
>>> Lars Eggert wrote:
>>>> Hi,
>>>>
>>>> every few days or so, my -STABLE NFS server (v3 and v4) gets
>>>> wedged with a ton of messages about "nfsd server cache flooded,
>>>> try to increase nfsrc_floodlevel" in the log, and nfsstat shows
>>>> TCPPeak at 16385. It requires a reboot to unwedge, restarting the
>>>> server does not help.
>>>>
>>>> The clients are (mostly) six -CURRENT nfsv4 boxes that netboot
>>>> from the server and mount all drives from there.
>>>>
> Have you tried increasing vfs.nfsd.tcphighwater?
> This needs to be increased to increase the flood level above 16384.
>
> Garrett Wollman sets:
> vfs.nfsd.tcphighwater=100000
> vfs.nfsd.tcpcachetimeo=300
>
> or something like that, if I recall correctly.

Thank you for your help!

I have read about tuning these sysctls, but I object to altering them
individually, because I don't have hundreds of clients torturing a poor
server, or any other poorly balanced setup.
I run into this problem with a single client, connected via a 1GbE (not
10 or 40GbE) link, talking to a modern server with 10G of RAM, and this
environment forces me to reboot the storage server every second day.
IMHO such a setup shouldn't require manual tuning, and I consider this a
really urgent problem!
Whatever causes the server to lock up really needs to be fixed for the
next release; otherwise the shipped NFS implementation is not really
suitable for production environments and needs a warning message when
enabled.
The impact of this failure forces admins to change the operating system
in order to get a core service back into operation.
The important point is that I am not suffering from weaker performance
or lags/delays; my server stops serving NFS completely, and only a
reboot resolves the situation.

Are there later modifications or other findings that are known to
obsolete your noopen.patch
(http://people.freebsd.org/~rmacklem/noopen.patch)?

I'm testing it at the moment, but I'm also seeing other panics on the
same machine related to vfs locking, so test results won't be available
very soon.

Thank you,

-Harry