Subject: Re: NFS unstable with high load on server
To: freebsd-questions@freebsd.org
From: "William A. Mahaffey III" <wam@hiwaay.net>
Message-ID: <56B0C2EC.2090706@hiwaay.net>
Date: Tue, 2 Feb 2016 08:59:02 -0553.75
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

On 02/01/16 08:32, Vick Khera wrote:
> I have a handful of servers at my data center, all running FreeBSD 10.2. On
> one of them I have a copy of the FreeBSD sources shared via NFS. When this
> server is running a large poudriere run rebuilding all the ports I need,
> the clients' NFS mounts become unstable. That is, the clients keep getting
> read failures. The interactive performance of the NFS server is fine,
> however. The local file system is a ZFS mirror.
>
> What could be causing NFS to be unstable in this situation?
>
> Specifics:
>
> Server "lorax": FreeBSD 10.2-RELEASE-p7, locally compiled kernel, with the
> NFS server and ZFS as dynamic kernel modules. 16 GB RAM, quad-core 3.1 GHz
> Xeon.
>
> The directory /u/lorax1 is a ZFS dataset on a mirrored pool and is NFS
> exported via the ZFS exports file. I put the FreeBSD sources on this
> dataset and symlink /usr/src to it.
>
> Client "bluefish": FreeBSD 10.2-RELEASE-p5, locally compiled kernel, NFS
> client built into the kernel. 32 GB RAM, quad-core 3.1 GHz Xeon (basically
> the same hardware but more RAM).
>
> The directory /n/lorax1 is NFS mounted from lorax via autofs. The NFS
> options are "intr,nolockd". /usr/src is symlinked to the sources in that
> NFS mount.
>
> What I observe:
>
> [lorax]~% cd /usr/src
> [lorax]src% svn status
> [lorax]src% w
>  9:12AM  up 12 days, 19:19, 4 users, load averages: 4.43, 4.45, 3.61
> USER   TTY   FROM                  LOGIN@  IDLE WHAT
> vivek  pts/0 vick.int.kcilink.com  8:44AM     - tmux: client (/tmp/
> vivek  pts/1 tmux(19747).%0        8:44AM    19 sed y%*+%pp%;s%[^_a
> vivek  pts/2 tmux(19747).%1        8:56AM     - w
> vivek  pts/3 tmux(19747).%2        8:56AM     - slogin bluefish-prv
> [lorax]src% pwd
> /u/lorax1/usr10/src
>
> So right now the load average is more than 1 per processor on lorax. I can
> quite easily run "svn status" on the source directory, and the interactive
> performance is pretty snappy for editing local files and navigating around
> the file system.
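For reference, the setup described above usually amounts to something like the
following on FreeBSD 10.x. This is only a rough sketch: the pool name, private
network, and autofs map file name below are illustrative assumptions, not
details taken from the post.

    # On the server: share the dataset through the ZFS exports file.
    # "tank/lorax1" and the network are placeholders; options use exports(5) syntax.
    zfs set sharenfs="-network 192.168.1.0/24" tank/lorax1
    service mountd reload    # mountd re-reads /etc/exports and /etc/zfs/exports

    # On the client: autofs entries (file names are assumed).
    # /etc/auto_master
    /n        /etc/auto_lorax

    # /etc/auto_lorax -- indirect map; "intr,nolockd" are the NFS mount options
    lorax1    -intr,nolockd    lorax-prv:/u/lorax1

With autofs_enable="YES" in /etc/rc.conf the automounter starts at boot, and
"automount -L" can be used to check that the maps parse as intended.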
>
> On the client:
>
> [bluefish]~% cd /usr/src
> [bluefish]src% pwd
> /n/lorax1/usr10/src
> [bluefish]src% svn status
> svn: E070008: Can't read directory '/n/lorax1/usr10/src/contrib/sqlite3':
> Partial results are valid but processing is incomplete
> [bluefish]src% svn status
> svn: E070008: Can't read directory '/n/lorax1/usr10/src/lib/libfetch':
> Partial results are valid but processing is incomplete
> [bluefish]src% svn status
> svn: E070008: Can't read directory
> '/n/lorax1/usr10/src/release/picobsd/tinyware/msg': Partial results are
> valid but processing is incomplete
> [bluefish]src% w
>  9:14AM  up 93 days, 23:55, 1 user, load averages: 0.10, 0.15, 0.15
> USER   TTY   FROM                   LOGIN@  IDLE WHAT
> vivek  pts/0 lorax-prv.kcilink.com  8:56AM     - w
> [bluefish]src% df .
> Filesystem          1K-blocks    Used     Avail Capacity  Mounted on
> lorax-prv:/u/lorax1 932845181 6090910 926754271     1%    /n/lorax1
>
> What I see is more or less random failures to read the NFS volume. When the
> server is not busy running poudriere builds, the client never has any
> failures.
>
> I also see this kind of failure when doing buildworld or installworld on
> the client while the server is busy -- I get strange random failures
> reading files, causing the build or install to fail.
>
> My workaround is not to do builds or installs on client machines while the
> NFS server is busy with large jobs like building all packages, but there is
> definitely something wrong here that I'd like to fix. I observe this on all
> the local NFS clients. I rebooted the server to try to clear this up, but
> that did not fix it.
>
> Any help would be appreciated.
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"

I notice similar issues. I have some in-house code that I recompile natively
every night on a FreeBSD 9.3R box with an 8-HDD unmirrored ZFS pool, and also
across the LAN on a Linux box with the Intel compiler suite. I run nightly
backups across the LAN as well. Both the cross-LAN compiles and the backups use
NFS to reach other boxes on the LAN, notably including the dev box doing the
native compile under FreeBSD 9.3R. If these processes overlap (too much, or at
all), the native compiles often fail and/or the backups barf. I had assumed it
was an NFS/ZFS issue: earlier boxes didn't use ZFS -- they were Linux or SGI
(snif) -- and while they bogged down mightily when the processes overlapped,
they did finish cleanly.

-- 
William A. Mahaffey III

----------------------------------------------------------------------

        "The M1 Garand is without doubt the finest implement of war
         ever devised by man."
                                      -- Gen. George S. Patton Jr.