Message-Id: <201602021848.u12ImDES067799@gw.catspoiler.org>
Date: Tue, 2 Feb 2016 10:48:13 -0800 (PST)
From: Don Lewis
Subject: Re: NFS unstable with high load on server
To: spork@bway.net
cc: woodsb02@gmail.com, freebsd-fs@freebsd.org, vivek@khera.org, freebsd-questions@freebsd.org
In-Reply-To: <5EAD4A4A-211F-451E-A3B9-752DAC6D94B4@bway.net>
List-Id: Filesystems

On 2 Feb, Charles Sprickman wrote:
> On Feb 2, 2016, at 1:10 AM, Ben Woods wrote:
>>
>> On Monday, 1 February 2016, Vick Khera wrote:
>>
>>> I have a handful of servers at my data center all running FreeBSD 10.2. On
>>> one of them I have a copy of the FreeBSD sources shared via NFS. When this
>>> server is running a large poudriere run re-building all the ports I need,
>>> the clients' NFS mounts become unstable.
>>> That is, the clients keep getting
>>> read failures. The interactive performance of the NFS server is just fine,
>>> however. The local file system is a ZFS mirror.
>>>
>>> What could be causing NFS to be unstable in this situation?
>>>
>>> Specifics:
>>>
>>> Server "lorax" FreeBSD 10.2-RELEASE-p7 kernel locally compiled, with NFS
>>> server and ZFS as dynamic kernel modules. 16GB RAM, Xeon 3.1GHz quad
>>> processor.
>>>
>>> The directory /u/lorax1 a ZFS dataset on a mirrored pool, and is NFS
>>> exported via the ZFS exports file. I put the FreeBSD sources on this
>>> dataset and symlink to /usr/src.
>>>
>>>
>>> Client "bluefish" FreeBSD 10.2-RELEASE-p5 kernel locally compiled, NFS
>>> client built in to kernel. 32GB RAM, Xeon 3.1GHz quad processor (basically
>>> same hardware but more RAM).
>>>
>>> The directory /n/lorax1 is NFS mounted from lorax via autofs. The NFS
>>> options are "intr,nolockd". /usr/src is symlinked to the sources in that
>>> NFS mount.
>>>
>>>
>>> What I observe:
>>>
>>> [lorax]~% cd /usr/src
>>> [lorax]src% svn status
>>> [lorax]src% w
>>> 9:12AM up 12 days, 19:19, 4 users, load averages: 4.43, 4.45, 3.61
>>> USER TTY FROM LOGIN@ IDLE WHAT
>>> vivek pts/0 vick.int.kcilink.com 8:44AM - tmux: client
>>> (/tmp/
>>> vivek pts/1 tmux(19747).%0 8:44AM 19 sed
>>> y%*+%pp%;s%[^_a
>>> vivek pts/2 tmux(19747).%1 8:56AM - w
>>> vivek pts/3 tmux(19747).%2 8:56AM - slogin
>>> bluefish-prv
>>> [lorax]src% pwd
>>> /u/lorax1/usr10/src
>>>
>>> So right now the load average is more than 1 per processor on lorax. I can
>>> quite easily run "svn status" on the source directory, and the interactive
>>> performance is pretty snappy for editing local files and navigating around
>>> the file system.
>>>
>>>
>>> On the client:
>>>
>>> [bluefish]~% cd /usr/src
>>> [bluefish]src% pwd
>>> /n/lorax1/usr10/src
>>> [bluefish]src% svn status
>>> svn: E070008: Can't read directory '/n/lorax1/usr10/src/contrib/sqlite3':
>>> Partial results are valid but processing is incomplete
>>> [bluefish]src% svn status
>>> svn: E070008: Can't read directory '/n/lorax1/usr10/src/lib/libfetch':
>>> Partial results are valid but processing is incomplete
>>> [bluefish]src% svn status
>>> svn: E070008: Can't read directory
>>> '/n/lorax1/usr10/src/release/picobsd/tinyware/msg': Partial results are
>>> valid but processing is incomplete
>>> [bluefish]src% w
>>> 9:14AM up 93 days, 23:55, 1 user, load averages: 0.10, 0.15, 0.15
>>> USER TTY FROM LOGIN@ IDLE WHAT
>>> vivek pts/0 lorax-prv.kcilink.com 8:56AM - w
>>> [bluefish]src% df .
>>> Filesystem 1K-blocks Used Avail Capacity Mounted on
>>> lorax-prv:/u/lorax1 932845181 6090910 926754271 1% /n/lorax1
>>>
>>>
>>> What I see is more or less random failures to read the NFS volume. When the
>>> server is not so busy running poudriere builds, the client never has any
>>> failures.
>>>
>>> I also observe this kind of failure doing buildworld or installworld on
>>> the client when the server is busy -- I get strange random failures reading
>>> the files causing the build or install to fail.
>>>
>>> My workaround is to not do build/installs on client machines when the NFS
>>> server is busy doing large jobs like building all packages, but there is
>>> definitely something wrong here I'd like to fix. I observe this on all the
>>> local NFS clients. I rebooted the server before to try to clear this up but
>>> it did not fix it.
>>>
>>> Any help would be appreciated.
>>>
>>
>> I just wanted to point out that I am experiencing this exact same issue in
>> my home setup.
>>
>> Performing an installworld from an NFS mount works perfectly, until I start
>> running poudriere on the NFS server.
>> Then I start getting NFS timeouts and
>> the installworld fails.
>>
>> The NFS server is also using ZFS, but the NFS export in my case is being
>> done via the ZFS property "sharenfs" (I am not using the /etc/exports file).
>
> Me three. I'm actually updating a small group of servers now and started
> blowing up my installworlds by trying to do some poudriere builds at the same
> time. Very repeatable. Of note, I'm on 9.3, and saw this on 8.4 as well. If I
> track down the client-side failures, it's always "permission denied".

That sort of sounds like the problem that was fixed in HEAD with r241561
and r241568. It was merged to 9-STABLE before 9.3-RELEASE. Try adding
the -S option to mountd_flags. I have no idea why that isn't the
default.

When poudriere is running, it frequently mounts and unmounts
filesystems. When this happens, mount(8) and umount(8) notify mountd to
update the exports list. This is not done atomically, so NFS
transactions can fail while mountd updates the exports list. The fix
mentioned above pauses the nfsd threads while the exports list update is
in progress to prevent the problem.

I don't know how this works with ZFS sharenfs, though.
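
As a rough sketch of the -S suggestion, the /etc/rc.conf entry on the NFS
server might look like the fragment below. It assumes the stock default of
mountd_flags="-r" (worth checking against /etc/defaults/rc.conf on your
release); if you already pass other flags, keep them and just append -S.

```shell
# /etc/rc.conf fragment on the NFS server (sketch, not a tested config).
# Assumption: the release default is mountd_flags="-r".
# -S asks mountd to suspend the nfsd threads while it reloads the
# exports list, so clients do not hit a half-updated list.
mountd_flags="-r -S"
```

mountd only reads its flags at startup, so it would need to be restarted
(for example with "service mountd restart") before the change takes effect.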