Date: Fri, 25 Jul 2014 07:14:02 -0400 (EDT)
From: Rick Macklem
To: Harald Schmalzbauer
Cc: freebsd-stable
Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel
Message-ID: <2146856958.3199855.1406286842423.JavaMail.root@uoguelph.ca>
In-Reply-To: <53D20A49.5020803@omnilan.de>

Harald Schmalzbauer wrote:
> Regarding Rick Macklem's message of 25.07.2014
> 02:14 (localtime):
> > Harald Schmalzbauer wrote:
> >> Regarding Rick Macklem's message of 08.08.2013 14:20 (localtime):
> >>> Lars Eggert wrote:
> >>>> Hi,
> >>>>
> >>>> every few days or so, my -STABLE NFS server (v3 and v4) gets wedged
> >>>> with a ton of messages about "nfsd server cache flooded, try to
> >>>> increase nfsrc_floodlevel" in the log, and nfsstat shows TCPPeak at
> >>>> 16385. It requires a reboot to unwedge; restarting the server does
> >>>> not help.
> >>>>
> >>>> The clients are (mostly) six -CURRENT nfsv4 boxes that netboot from
> >>>> the server and mount all drives from there.
> >>>>
> > Have you tried increasing vfs.nfsd.tcphighwater?
> > This needs to be increased to increase the flood level above 16384.
> >
> > Garrett Wollman sets:
> > vfs.nfsd.tcphighwater=100000
> > vfs.nfsd.tcpcachetimeo=300
> >
> > or something like that, if I recall correctly.
>
> Thank you for your help!
>
> I read about tuning these sysctls, but I object to altering them
> individually, because I don't have hundreds of clients torturing a poor
> server or any other poorly balanced setup.
> I run into this problem with one client, connected via a 1GbE (not 10 or
> 40GbE) link, talking to a modern server with 10G RAM - and this
> environment forces me to reboot the storage server every second day.
> IMHO such a setup shouldn't require manual tuning and I consider this
> a really urgent problem!
> Whatever causes the server to lock up really needs to be fixed for the
> next release; otherwise the shipped implementation of NFS is not really
> suitable for production environments and needs a warning message when
> enabled.
> The impact of this failure forces admins to change the operating system
> in order to get a core service back into operation.
> The important point is that I don't merely suffer from weaker
> performance or lags/delays; my server stops NFS completely and only a
> reboot resolves the situation.
>

Btw, you can increase vfs.nfsd.tcphighwater on the fly when it wedges
and avoid having to reboot.

> Are there later modifications or other findings which are known to
> obsolete your noopen.patch
> (http://people.freebsd.org/~rmacklem/noopen.patch)?
>
> I'm testing this atm, but I'm having other panics on the same machine
> related to vfs locking, so results of the test won't be available
> soon.
>
> Thank you,
>
> -Harry
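
For reference, the on-the-fly adjustment discussed in this thread could
look roughly like the sketch below. The two sysctl names and the values
are the ones quoted above ("or something like that"); the helper name
set_nfsd_sysctl and the APPLY switch are hypothetical, added only so the
script prints the commands by default instead of changing a live server.
It assumes both sysctls are writable at runtime, which the thread only
states explicitly for vfs.nfsd.tcphighwater.

```shell
#!/bin/sh
# Sketch only: raise the NFS server request-cache limits at runtime.
# Sysctl names and values are taken from the thread; set_nfsd_sysctl
# and APPLY are hypothetical names introduced for this illustration.

set_nfsd_sysctl() {
    # $1 = sysctl name, $2 = new value
    if [ "${APPLY:-0}" = "1" ]; then
        # Really change the running kernel setting (needs root).
        sysctl "$1=$2"
    else
        # Dry run (default): only print the command that would be run.
        echo "sysctl $1=$2"
    fi
}

set_nfsd_sysctl vfs.nfsd.tcphighwater 100000
set_nfsd_sysctl vfs.nfsd.tcpcachetimeo 300
```

Run it as root with APPLY=1 to apply the values. To keep them across
reboots, the same name=value pairs can be placed in /etc/sysctl.conf.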