From: Harald Schmalzbauer
Organization: OmniLAN
Date: Fri, 25 Jul 2014 13:11:19 +0200
To: Rick Macklem
Cc: freebsd-stable
Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel
Message-ID: <53D23B57.8020208@omnilan.de>
In-Reply-To: <1466875806.3196288.1406284711139.JavaMail.root@uoguelph.ca>

Regarding Rick Macklem's message of 25.07.2014 12:38 (localtime):
> Harald Schmalzbauer wrote:
>> Regarding Rick Macklem's message of 25.07.2014 02:14 (localtime):
>>> Harald Schmalzbauer wrote:
>>>> Regarding Rick Macklem's message of 08.08.2013 14:20 (localtime):
>>>>> Lars Eggert wrote:
>>>>>> Hi,
>>>>>>
>>>>>> every few days or so, my -STABLE NFS server (v3 and v4) gets wedged
>>>>>> with a ton of messages about "nfsd server cache flooded, try to
>>>>>> increase nfsrc_floodlevel" in the log, and nfsstat shows TCPPeak at
>>>>>> 16385. It requires a reboot to unwedge, restarting the server does
>>>>>> not help.
>>>>>>
>>>>>> The clients are (mostly) six -CURRENT nfsv4 boxes that netboot from
>>>>>> the server and mount all drives from there.
>>>>>>
>>> Have you tried increasing vfs.nfsd.tcphighwater?
>>> This needs to be increased to increase the flood level above 16384.
>>>
>>> Garrett Wollman sets:
>>> vfs.nfsd.tcphighwater=100000
>>> vfs.nfsd.tcpcachetimeo=300
>>>
>>> or something like that, if I recall correctly.
>> Thank you for your help!
>>
>> I read about tuning these sysctls, but I object to altering them
>> individually, because I don't have hundreds of clients torturing a poor
>> server or any other not well balanced setup.
>> I run into this problem with one client, connected via a 1GbE (not 10 or
>> 40GbE) link, talking to a modern server with 10G RAM - and this
>> environment forces me to reboot the storage server every 2nd day.
>> IMHO such a setup shouldn't require manual tuning and I consider this
>> a really urgent problem!
> Btw, what you can do to help with this is experiment with the tunable
> and if you find a setting that works well for your server, report that
> back as a data point that can be used for this.
>
> If you make it too large, the server runs out of address space that
> can be used by malloc() and that results in the whole machine being
> wedged and not just the NFS server.

I'd happily provide experimental results, but I see my environment (the
only one where I have reintroduced NFS at the moment) as uncommon, because
few LANs out there have NFS services with just two clients, of which only
one really uses NFS.

So before tuning sysctls in production environments other than my own
(small and uncommon) setup, I need proof that NFS (v4) is usable these
days.
If the noopen.patch proves to be one way to stabilize things, I'll be able
to find optimized settings for vfs.nfsd.tcp*. Then I could have the patched
kernel in addition, which I need to be able to ensure reliable service.
Additionally, I should first read up somewhere on what they do, to get the
right understanding…

Thanks,

-Harry

P.S.: I'd happily donate some used GbE switch+server if that helps!
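
For anyone who wants to experiment with the two sysctls quoted in this
thread, here is a minimal sketch that renders them as name=value lines of
the kind /etc/sysctl.conf accepts. The choice of Python, the helper name
render_sysctl_conf and the constant QUOTED_SETTINGS are assumptions made
for illustration. The numbers are only the values quoted in the thread
("or something like that"), so treat them as starting points for testing,
not as recommended settings.

```python
def render_sysctl_conf(settings):
    """Render a {name: value} mapping as sysctl.conf-style 'name=value' lines.

    Hypothetical helper for illustration; it only formats text and does
    not touch the running system.
    """
    lines = []
    for name, value in settings.items():
        # Accept only positive integers (bool is a subclass of int, so
        # exclude it explicitly).
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name}: expected a positive integer, got {value!r}")
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


# Values as quoted in the thread; starting points for experiments only.
QUOTED_SETTINGS = {
    "vfs.nfsd.tcphighwater": 100000,
    "vfs.nfsd.tcpcachetimeo": 300,
}

if __name__ == "__main__":
    print(render_sysctl_conf(QUOTED_SETTINGS), end="")
```

The same name=value pairs can also be passed to sysctl(8) to try a setting
at runtime before making it persistent. Keep in mind the warning above
that too large a value can wedge the whole machine.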