Subject: Re: nfs_getpages: error 4
From: Dmitry Sivachenko
Date: Sat, 5 Mar 2016 16:42:45 +0300
To: Eugene Grosbein
Cc: FreeBSD Stable ML
In-Reply-To: <56DAE033.9020304@grosbein.net>
References: <56DACD4E.3070905@grosbein.net> <550ADE4F-9F60-44FB-BF07-A1384A6B7B1A@gmail.com> <56DAE033.9020304@grosbein.net>

> On 05 Mar 2016, at 16:33, Eugene Grosbein wrote:
>
> 05.03.2016 19:32, Dmitry Sivachenko wrote:
>
>>>> I am running a number of machines with /home mounted via nfs (FreeBSD 10.3-PRERELEASE #0 r294799, rw,bg,intr,soft).
>>>>
>>>> Sometimes I get the following messages in syslog:
>>>>
>>>> nfs_getpages: error 4
>>>> vm_fault: pager read error, pid NNN (myprog)
>>>>
>>>> After that I see a lot of processes stuck in "pfault" state (these are computational processes which use some files from NFS mount), they use 0% of CPU after that.
>>>>
>>>> On NFS server machine I see nothing strange in logs. procstat -kk for such stuck processes shows:
>>>> PID TID COMM TDNAME KSTACK
>>>> 85274 102056 myprog - mi_switch+0xbe sleepq_wait+0x3a _sleep+0x287 vm_waitpfault+0x8a vm_fault_hold+0xdd0 vm_fault+0x77 trap_pfault+0x180 trap+0x52c calltrap+0x8
>>>>
>>>>
>>>> What can be the reason of this?
>>>
>>> For example, if some processes running on NFS server box modify some files "in-place"
>>> and these files are opened by processes running on NFS client, that could be the reason.
>>> If so, change this so processes updating such files create new temporary versions of them first
>>> and then rename them atomically.
>>>
>>
>> This should not be the case: users are working only on NFS clients.
>> Moreover, the nature of computations is so that each process uses its own set of files.
>>
>> (Forgot to mention in my previous e-mail that these processes can't be stopped even with kill -9)
>
> Make sure you use TCP mounts and TSO is disabled.

I do use a TCP mount (this is the default). I will try to disable TSO.

> Try switching between NFSv3/NFSv4 to avoid this bug

As far as I understand, the default is NFSv3 (which should be more stable?). I can try to switch to NFSv4.

> and to discover what version is broken. And show full mount command/option set.

I already included the mount flags from fstab in my original e-mail: rw,bg,intr,soft
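
For reference, the "create a temporary version, then rename atomically" pattern from Eugene's suggestion quoted above could be sketched roughly as follows. This is only an illustration in Python: the helper name atomic_write and the ".tmp-" prefix are made up for the example and nothing here comes from the actual programs in question. It relies on rename(2) semantics on a POSIX filesystem and says nothing about how an NFS client that already has the old file open will behave.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) via a temp file and a rename.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename does not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the new contents to stable storage before the rename.
            os.fsync(tmp.fileno())
        # os.replace maps to rename(2) on POSIX: the name switches from the
        # old contents to the new contents in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the leftover temporary file if we failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that mkstemp creates the file with mode 0600, so a real updater would probably want to restore the original permissions before the rename.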