From: Konstantin Belousov
To: Dmitry Sivachenko
Cc: Eugene Grosbein , FreeBSD Stable ML
Date: Sat, 5 Mar 2016 20:35:58 +0200
Subject: Re: nfs_getpages: error 4
Message-ID: <20160305183558.GB67250@kib.kiev.ua>
In-Reply-To: <415B1CAA-2728-48D0-96F0-C20F02F8A045@gmail.com>
References: <56DACD4E.3070905@grosbein.net>
 <550ADE4F-9F60-44FB-BF07-A1384A6B7B1A@gmail.com>
 <56DAE033.9020304@grosbein.net>
 <56DAE6D2.5040309@grosbein.net>
 <9BBCBDD0-4DAD-4189-9AAA-9FD94A458F7E@gmail.com>
 <20160305162747.GA67250@kib.kiev.ua>
 <415B1CAA-2728-48D0-96F0-C20F02F8A045@gmail.com>
List-Id: Production branch of FreeBSD source code

On Sat, Mar 05, 2016 at 07:42:51PM +0300, Dmitry Sivachenko wrote:
>
> > On 05 Mar 2016, at 19:27, Konstantin Belousov wrote:
> >
> > On Sat, Mar 05, 2016 at 05:24:26PM +0300, Dmitry Sivachenko wrote:
> >>>
> >>> Again, error 4 is EINTR so you could disable both "soft" and "intr" options for test.
> >>
> >>
> >> "soft" is meaningless in such setup, because "file system calls will fail after retrycnt round trip timeout intervals" but "The default is a retry count of zero, which means to keep retrying forever".
> >>
> >> If I understand "intr" correctly, it matters only when server becomes unresponsive, that is "server is not responding" message should be in my logs. But I have no such a message.
> >>
> >>
> >
> > The intr NFS mount option allows signals to interrupt NFS waits for the
> > RPC responses. This is almost certainly the reason for the EINTR error
> > you get from the pager.
> >
> > You should at least get the
> > vm_fault: pager read error, pid ...
> > messages as well. Is this true ?
>
>
> That is true, see my initial post.

Ok.

> >
> > The end result would be SIGSEGV
> > delivered to the process.
> >
> > OTOH, I do not quite understand why did your threads requesting page-in
> > fall into the wait for a free page. I assume that there is enough free
> > pages in the system ?
> >
>
>
> I have no swap configured, but it is possible that running processes eat all RAM (I expect them to be killed with OOM rather than stuck?)

I cannot answer this question about 'eat all RAM'. You can. But I suspect
that you do have enough free or reclaimable pages for OOM not to trigger,
e.g. because you showed command output from the live system after the
situation occurred. More likely it was a temporary free page shortage,
after which the system recovered.

I am more inclined to believe there is a bug in the handling of a killed
process in vm_fault(). Could you get the p_flag value for the hung
process ? Like ps -o flags
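
For reference, the mapping from the numeric error in the subject line to EINTR can be checked with a few lines of script. This is only a minimal sketch in Python and not something from the thread itself; the helper name describe_errno is made up for illustration, and the message text returned depends on the platform's C library.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value.

    describe_errno is a hypothetical helper used only for this example.
    """
    # errno.errorcode maps numeric values to names such as "EINTR".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(4)
    print("error 4 = {}: {}".format(name, message))
```

On FreeBSD and Linux the value 4 should resolve to EINTR, which matches the reading of "nfs_getpages: error 4" given above.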