From owner-svn-src-head@FreeBSD.ORG Tue Feb 26 07:27:31 2013
Delivered-To: svn-src-head@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	by hub.freebsd.org (Postfix) with ESMTP id 8DBB776B;
	Tue, 26 Feb 2013 07:27:31 +0000 (UTC)
	(envelope-from andrew@fubar.geek.nz)
Received: from smtp3.clear.net.nz (smtp3.clear.net.nz [203.97.33.64])
	by mx1.freebsd.org (Postfix) with ESMTP id 595D320D;
	Tue, 26 Feb 2013 07:27:31 +0000 (UTC)
Received: from mxin1-orange.clear.net.nz (lb2-srcnat.clear.net.nz [203.97.32.237])
	by smtp3.clear.net.nz (CLEAR Net Mail) with ESMTP id <0MIT0069RGPTX700@smtp3.clear.net.nz>;
	Tue, 26 Feb 2013 20:27:30 +1300 (NZDT)
Received: from 202-0-48-19.paradise.net.nz (HELO bender) ([202.0.48.19])
	by smtpin1.paradise.net.nz with ESMTP;
	Tue, 26 Feb 2013 20:27:30 +1300
Date: Tue, 26 Feb 2013 20:27:07 +1300
From: Andrew Turner
Subject: Re: svn commit: r247116 - in head/sys: fs/nfs fs/nfsclient kern nfsclient sys tools
In-reply-to: <8006325C-B281-4F4D-BE1A-C3B444FE979F@rice.edu>
To: Alan Cox
Message-id: <20130226202707.026ad226@bender>
MIME-version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7bit
References: <201302211902.r1LJ2o5T033708@svn.freebsd.org>
	<20130225201313.2050da18@bender>
	<20130225085019.GU2454@kib.kiev.ua>
	<20130225233603.49a5d4a5@bender>
	<8006325C-B281-4F4D-BE1A-C3B444FE979F@rice.edu>
Cc: Konstantin Belousov, svn-src-head@freebsd.org, svn-src-all@freebsd.org,
	src-committers@freebsd.org, John Baldwin
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Tue, 26 Feb 2013 07:27:31 -0000

On Mon, 25 Feb 2013 15:00:41 -0600
Alan Cox wrote:

>
> On Feb 25, 2013, at 4:36 AM, Andrew Turner wrote:
>
> > On Mon, 25 Feb 2013 10:50:19 +0200
> > Konstantin Belousov wrote:
> >
> >> On Mon, 
Feb 25, 2013 at 08:13:13PM +1300, Andrew Turner wrote:
> >>> On Thu, 21 Feb 2013 19:02:50 +0000 (UTC)
> >>> John Baldwin wrote:
> >>>
> >>>> Author: jhb
> >>>> Date: Thu Feb 21 19:02:50 2013
> >>>> New Revision: 247116
> >>>> URL: http://svnweb.freebsd.org/changeset/base/247116
> >>>>
> >>>> Log:
> >>>>   Further refine the handling of stop signals in the NFS client.
> >>>>   The changes in r246417 were incomplete as they did not add
> >>>>   explicit calls to sigdeferstop() around all the places that
> >>>>   previously passed SBDRY to _sleep(). In addition,
> >>>>   nfs_getcacheblk() could trigger a write RPC from getblk()
> >>>>   resulting in sigdeferstop() recursing. Rather than manually
> >>>>   deferring stop signals in specific places, change the VFS_*() and
> >>>>   VOP_*() methods to defer stop signals for filesystems which
> >>>>   request this behavior via a new VFCF_SBDRY flag. Note that this
> >>>>   has to be a VFC flag rather than a MNTK flag so that it works
> >>>>   properly with VFS_MOUNT() when the mount is not yet fully
> >>>>   constructed. For now, only the NFS clients are set this new flag
> >>>>   in VFS_SET(). A few other related changes:
> >>>>   - Add an assertion to ensure that TDF_SBDRY doesn't leak to
> >>>>     userland.
> >>>>   - When a lookup request uses VOP_READLINK() to follow a symlink,
> >>>>     mark the request as being on behalf of the thread performing the
> >>>>     lookup (cnp_thread) rather than using a NULL thread pointer.
> >>>>     This causes NFS to properly handle signals during this VOP on an
> >>>>     interruptible mount.
> >>>>
> >>>> PR: kern/176179
> >>>> Reported by: Russell Cattelan (sigdeferstop() recursion)
> >>>> Reviewed by: kib
> >>>> MFC after: 1 month
> >>>
> >>> This change is causing init to crash for me on armv6. I'm
> >>> netbooting a PandaBoard and it appears init is receiving a SIGABRT
> >>> before it gets into main().
> >>>
> >>> Do you have any idea where I could look to track down why it is
> >>> doing this?
> >>
> >> It is weird. 
SIGABRT sent by the kernel usually means that
> >> execve(2) already destroyed the previous address space of the
> >> process, but the new image cannot be activated, most likely due to
> >> image format error discovered too late, or resource shortage.
> >>
> >> Could it be that some NFS RPC fails after the patch, but I cannot
> >> imagine why. You would need to track this. Also, verify that the
> >> init binary is correct.
> >>
> >> I tried amd64 netboot, and it worked fine.
> >
> > It looks like this change is not the issue, it just changed the
> > symptom enough for me to not realise I was seeing an issue where
> > it would crash the kernel before. I reinstated this change but only
> > allowed the kernel to access half the memory and it booted
> > correctly.
> >
> > The real issue appears to be related to something in the vm layer
> > not working on ARM boards with too much memory (somewhere between
> > 512MiB and 1GiB).
> >
>
> The recently introduced auto-sizing and cap may be too optimistic.
> In fact, they are greater than what we allow on 32-bit x86 and 32-bit
> MIPS. Try the following.
>
> Index: arm/include/vmparam.h
> ===================================================================
> --- arm/include/vmparam.h	(revision 247249)
> +++ arm/include/vmparam.h	(working copy)
> @@ -142,15 +142,15 @@
>  #define VM_KMEM_SIZE		(12*1024*1024)
>  #endif
>  #ifndef VM_KMEM_SIZE_SCALE
> -#define VM_KMEM_SIZE_SCALE	(2)
> +#define VM_KMEM_SIZE_SCALE	(3)
>  #endif
>  
>  /*
> - * Ceiling on the size of the kmem submap: 60% of the kernel map.
> + * Ceiling on the size of the kmem submap: 40% of the kernel map.
>   */
>  #ifndef VM_KMEM_SIZE_MAX
>  #define VM_KMEM_SIZE_MAX	((vm_max_kernel_address - \
> -    VM_MIN_KERNEL_ADDRESS + 1) * 3 / 5)
> +    VM_MIN_KERNEL_ADDRESS + 1) * 2 / 5)
>  #endif
>  
>  #define MAXTSIZ		(16*1024*1024)
>

This patch fixes the boot for me. Is it likely we will see similar issues with boards with more memory with this? 
I know of ARM boards with 2GiB of RAM, and I would expect to see some with more soon.

Andrew
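
To make the arithmetic behind the two changed defines concrete, here is a minimal sketch in C. It assumes a simplified sizing model: physical memory divided by the scale factor, raised to a floor, then clamped to a ceiling that is a fraction of the kernel map. That model, the function names kmem_size_estimate() and kmem_ceiling(), and any kernel map size passed in are assumptions for illustration only; the thread does not state the real kernel map size on these boards, and the kernel's actual sizing code may apply further limits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: ceiling on the kmem submap expressed as the
 * fraction num/den of a kernel map that is kmap_size bytes long.
 * The patch changes this fraction from 3/5 to 2/5.
 */
static inline uint64_t
kmem_ceiling(uint64_t kmap_size, unsigned num, unsigned den)
{
	assert(den > 0);
	return (kmap_size * num / den);
}

/*
 * Hypothetical helper modelling the auto-sizing: start from
 * physmem / scale, never go below min_size (VM_KMEM_SIZE in the
 * patch context, 12 MiB), never go above ceiling (VM_KMEM_SIZE_MAX).
 * This is a simplified model, not a copy of the kernel's code.
 */
static inline uint64_t
kmem_size_estimate(uint64_t physmem, uint64_t min_size, unsigned scale,
    uint64_t ceiling)
{
	uint64_t size;

	assert(scale > 0);
	size = physmem / scale;
	if (size < min_size)
		size = min_size;
	if (size > ceiling)
		size = ceiling;
	return (size);
}
```

Under this model, and taking a purely hypothetical 1 GiB kernel map, a board with 2 GiB of RAM gets kmem_size_estimate(2 GiB, 12 MiB, 3, kmem_ceiling(1 GiB, 2, 5)), which is the ceiling of 429496729 bytes. Once physmem / scale exceeds the ceiling, adding more RAM no longer grows the estimate, so in this sketch the ceiling fraction rather than the scale determines the result on larger boards.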