Date: Tue, 2 Feb 2016 17:41:31 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Don Lewis <truckman@FreeBSD.org>
Cc: spork@bway.net, freebsd-fs@freebsd.org, vivek@khera.org, freebsd-questions@freebsd.org
Subject: Re: NFS unstable with high load on server
Message-ID: <1270648257.999240.1454452891099.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <201602021848.u12ImDES067799@gw.catspoiler.org>
References: <201602021848.u12ImDES067799@gw.catspoiler.org>
Don Lewis wrote:
> On 2 Feb, Charles Sprickman wrote:
> > On Feb 2, 2016, at 1:10 AM, Ben Woods <woodsb02@gmail.com> wrote:
> >>
> >> On Monday, 1 February 2016, Vick Khera <vivek@khera.org> wrote:
> >>
> >>> I have a handful of servers at my data center all running FreeBSD 10.2.
> >>> On one of them I have a copy of the FreeBSD sources shared via NFS. When
> >>> this server is running a large poudriere run re-building all the ports I
> >>> need, the clients' NFS mounts become unstable. That is, the clients keep
> >>> getting read failures. The interactive performance of the NFS server is
> >>> just fine, however. The local file system is a ZFS mirror.
> >>>
> >>> What could be causing NFS to be unstable in this situation?
> >>>
> >>> Specifics:
> >>>
> >>> Server "lorax": FreeBSD 10.2-RELEASE-p7, kernel locally compiled, with
> >>> NFS server and ZFS as dynamic kernel modules. 16GB RAM, Xeon 3.1GHz quad
> >>> processor.
> >>>
> >>> The directory /u/lorax1 is a ZFS dataset on a mirrored pool, and is NFS
> >>> exported via the ZFS exports file. I put the FreeBSD sources on this
> >>> dataset and symlink to /usr/src.
> >>>
> >>> Client "bluefish": FreeBSD 10.2-RELEASE-p5, kernel locally compiled, NFS
> >>> client built in to the kernel. 32GB RAM, Xeon 3.1GHz quad processor
> >>> (basically the same hardware but more RAM).
> >>>
> >>> The directory /n/lorax1 is NFS mounted from lorax via autofs. The NFS
> >>> options are "intr,nolockd". /usr/src is symlinked to the sources in that
> >>> NFS mount.
> >>>
> >>> What I observe:
> >>>
> >>> [lorax]~% cd /usr/src
> >>> [lorax]src% svn status
> >>> [lorax]src% w
> >>>  9:12AM  up 12 days, 19:19, 4 users, load averages: 4.43, 4.45, 3.61
> >>> USER   TTY   FROM                  LOGIN@  IDLE WHAT
> >>> vivek  pts/0 vick.int.kcilink.com  8:44AM     - tmux: client (/tmp/
> >>> vivek  pts/1 tmux(19747).%0        8:44AM    19 sed y%*+%pp%;s%[^_a
> >>> vivek  pts/2 tmux(19747).%1        8:56AM     - w
> >>> vivek  pts/3 tmux(19747).%2        8:56AM     - slogin bluefish-prv
> >>> [lorax]src% pwd
> >>> /u/lorax1/usr10/src
> >>>
> >>> So right now the load average is more than 1 per processor on lorax. I
> >>> can quite easily run "svn status" on the source directory, and the
> >>> interactive performance is pretty snappy for editing local files and
> >>> navigating around the file system.
> >>>
> >>> On the client:
> >>>
> >>> [bluefish]~% cd /usr/src
> >>> [bluefish]src% pwd
> >>> /n/lorax1/usr10/src
> >>> [bluefish]src% svn status
> >>> svn: E070008: Can't read directory '/n/lorax1/usr10/src/contrib/sqlite3':
> >>> Partial results are valid but processing is incomplete
> >>> [bluefish]src% svn status
> >>> svn: E070008: Can't read directory '/n/lorax1/usr10/src/lib/libfetch':
> >>> Partial results are valid but processing is incomplete
> >>> [bluefish]src% svn status
> >>> svn: E070008: Can't read directory
> >>> '/n/lorax1/usr10/src/release/picobsd/tinyware/msg': Partial results are
> >>> valid but processing is incomplete
> >>> [bluefish]src% w
> >>>  9:14AM  up 93 days, 23:55, 1 user, load averages: 0.10, 0.15, 0.15
> >>> USER   TTY   FROM                   LOGIN@  IDLE WHAT
> >>> vivek  pts/0 lorax-prv.kcilink.com  8:56AM     - w
> >>> [bluefish]src% df .
> >>> Filesystem          1K-blocks    Used     Avail Capacity  Mounted on
> >>> lorax-prv:/u/lorax1 932845181 6090910 926754271     1%    /n/lorax1
> >>>
> >>> What I see is more or less random failures to read the NFS volume. When
> >>> the server is not so busy running poudriere builds, the client never has
> >>> any failures.
> >>>
> >>> I also observe this kind of failure doing buildworld or installworld on
> >>> the client when the server is busy -- I get strange random failures
> >>> reading the files, causing the build or install to fail.
> >>>
> >>> My workaround is to not do build/installs on client machines when the
> >>> NFS server is busy doing large jobs like building all packages, but
> >>> there is definitely something wrong here I'd like to fix. I observe this
> >>> on all the local NFS clients. I rebooted the server before to try to
> >>> clear this up, but it did not fix it.
> >>>
> >>> Any help would be appreciated.
> >>>
> >>
> >> I just wanted to point out that I am experiencing this exact same issue
> >> in my home setup.
> >>
> >> Performing an installworld from an NFS mount works perfectly, until I
> >> start running poudriere on the NFS server. Then I start getting NFS
> >> timeouts and the installworld fails.
> >>
> >> The NFS server is also using ZFS, but the NFS export in my case is being
> >> done via the ZFS property "sharenfs" (I am not using the /etc/exports
> >> file).
> >
> > Me three. I'm actually updating a small group of servers now and started
> > blowing up my installworlds by trying to do some poudriere builds at the
> > same time. Very repeatable. Of note, I'm on 9.3, and saw this on 8.4 as
> > well. If I track down the client-side failures, it's always "permission
> > denied".
>
> That sort of sounds like the problem that was fixed in HEAD with r241561
> and r241568. It was merged to 9-STABLE before 9.3-RELEASE. Try adding
> the -S option to mountd_flags. I have no idea why that isn't the
> default.
>
It isn't the default because...
- The first time I proposed it, the consensus was that it wasn't the
  correct fix and it shouldn't go into FreeBSD.
- About 2 years later, folks agreed that it was ok as an interim solution,
  so I made it a non-default option.
  --> This avoids it being considered a POLA violation.
Maybe in a couple more years it can become the default?

> When poudriere is running, it frequently mounts and unmounts filesystems.
> When this happens, mount(8) and umount(8) notify mountd to update the
> exports list. This is not done atomically, so NFS transactions can fail
> while mountd updates the export list. The fix mentioned above pauses the
> nfsd threads while the export list update is in progress to prevent the
> problem.
>
> I don't know how this works with ZFS sharenfs, though.
>
I think it should be fine either way. (ZFS sharenfs is an alternate way to
set up ZFS exports, but I believe the result is just adding the entries to
/etc/exports.) If it doesn't work for some reason, just put lines in
/etc/exports for the ZFS volumes instead of using ZFS sharenfs.

I recently had a report that "-S" would get stuck for a long time before
performing an update of the exports when the server is under heavy load.
I don't think this affects many people, but the attached 2-line patch
(not yet in head) fixes the problem for the guy that reported it.
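To make Don's suggestion concrete: on a typical setup the flag goes into
/etc/rc.conf on the NFS server and mountd gets restarted afterwards.
Something like the following should do it (keep whatever flags you already
pass to mountd; the variable names below are just the stock rc.conf knobs,
shown here as an example rather than your exact configuration):

    # /etc/rc.conf on the NFS server
    nfs_server_enable="YES"
    mountd_enable="YES"
    mountd_flags="-r -S"

Then "service mountd restart" (or a reboot) picks it up.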
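And since the ZFS sharenfs vs. /etc/exports question keeps coming up, the
two ways of exporting look roughly like this. The pool/dataset name and the
network/options are made up for the example, not taken from Vick's setup:

    # via the ZFS property on the dataset mounted at /u/lorax1:
    zfs set sharenfs="-network 192.168.1.0 -mask 255.255.255.0" tank/lorax1

    # or an equivalent hand-written line in /etc/exports:
    /u/lorax1 -network 192.168.1.0 -mask 255.255.255.0

Either way it is mountd that hands the export list to the kernel, which is
why the -S behaviour should matter for both.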
rick

> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

[Attachment: nfssuspend.patch]

--- fs/nfsserver/nfs_nfsdkrpc.c.sav2	2016-01-15 18:42:15.479783000 -0500
+++ fs/nfsserver/nfs_nfsdkrpc.c	2016-01-15 18:45:59.418245000 -0500
@@ -231,10 +231,16 @@ nfssvc_program(struct svc_req *rqst, SVC
 		 * Get a refcnt (shared lock) on nfsd_suspend_lock.
 		 * NFSSVC_SUSPENDNFSD will take an exclusive lock on
 		 * nfsd_suspend_lock to suspend these threads.
+		 * The call to nfsv4_lock() that preceeds nfsv4_getref()
+		 * ensures that the acquisition of the exclusive lock
+		 * takes priority over acquisition of the shared lock by
+		 * waiting for any exclusive lock request to complete.
 		 * This must be done here, before the check of
 		 * nfsv4root exports by nfsvno_v4rootexport().
 		 */
 		NFSLOCKV4ROOTMUTEX();
+		nfsv4_lock(&nfsd_suspend_lock, 0, NULL, NFSV4ROOTLOCKMUTEXPTR,
+		    NULL);
 		nfsv4_getref(&nfsd_suspend_lock, NULL, NFSV4ROOTLOCKMUTEXPTR,
 		    NULL);
 		NFSUNLOCKV4ROOTMUTEX();
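For anyone who wants to test the patch: the paths in it are relative to
sys/, so one way to apply it and rebuild, assuming the server's kernel is
built from a matching /usr/src tree and the attachment was saved somewhere
as nfssuspend.patch (the path below is just a placeholder), is roughly:

    cd /usr/src/sys
    patch < /path/to/nfssuspend.patch
    cd /usr/src
    make buildkernel      # add KERNCONF=<yourconfig> if you build a custom kernel
    make installkernel    # likewise, then reboot the server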