Date: Tue, 26 Feb 2013 21:08:35 +1300
From: Andrew Turner
Subject: Re: svn commit: r247116 - in head/sys: fs/nfs fs/nfsclient kern
 nfsclient sys tools
To: Alan Cox
Cc: Konstantin Belousov, svn-src-head@freebsd.org,
 svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin
Message-id: <20130226210835.749cd816@bender>
In-reply-to: <512C6916.5010608@rice.edu>
References: <201302211902.r1LJ2o5T033708@svn.freebsd.org>
 <20130225201313.2050da18@bender> <20130225085019.GU2454@kib.kiev.ua>
 <20130225233603.49a5d4a5@bender>
 <8006325C-B281-4F4D-BE1A-C3B444FE979F@rice.edu>
 <20130226202707.026ad226@bender> <512C6916.5010608@rice.edu>

On Tue, 26 Feb 2013 01:49:42 -0600
Alan Cox wrote:

> On 02/26/2013 01:27, Andrew Turner wrote:
> > On Mon, 25 Feb 2013 15:00:41 -0600
> > Alan Cox wrote:
> >
> >> On Feb 25, 2013, at 4:36 AM, Andrew Turner wrote:
> >>
> >>> On Mon, 25 Feb 2013 10:50:19 +0200
> >>> Konstantin Belousov wrote:
> >>>
> >>>> On Mon, Feb 25, 2013 at 08:13:13PM +1300, Andrew Turner wrote:
> >>>>> On Thu, 21 Feb 2013 19:02:50 +0000 (UTC)
> >>>>> John Baldwin wrote:
> >>>>>
> >>>>>> Author: jhb
> >>>>>> Date: Thu Feb 21 19:02:50 2013
> >>>>>> New Revision: 247116
> >>>>>> URL: http://svnweb.freebsd.org/changeset/base/247116
> >>>>>>
> >>>>>> Log:
> >>>>>>   Further refine the handling of stop signals in the NFS client.
> >>>>>>   The changes in r246417 were incomplete as they did not add
> >>>>>>   explicit calls to sigdeferstop() around all the places that
> >>>>>>   previously passed SBDRY to _sleep().  In addition,
> >>>>>>   nfs_getcacheblk() could trigger a write RPC from getblk()
> >>>>>>   resulting in sigdeferstop() recursing.  Rather than manually
> >>>>>>   deferring stop signals in specific places, change the VFS_*()
> >>>>>>   and VOP_*() methods to defer stop signals for filesystems which
> >>>>>>   request this behavior via a new VFCF_SBDRY flag.  Note that this
> >>>>>>   has to be a VFC flag rather than a MNTK flag so that it works
> >>>>>>   properly with VFS_MOUNT() when the mount is not yet fully
> >>>>>>   constructed.  For now, only the NFS clients are set this new
> >>>>>>   flag in VFS_SET().  A few other related changes:
> >>>>>>   - Add an assertion to ensure that TDF_SBDRY doesn't leak to
> >>>>>>     userland.
> >>>>>>   - When a lookup request uses VOP_READLINK() to follow a
> >>>>>>     symlink, mark the request as being on behalf of the thread
> >>>>>>     performing the lookup (cnp_thread) rather than using a NULL
> >>>>>>     thread pointer.  This causes NFS to properly handle signals
> >>>>>>     during this VOP on an interruptible mount.
> >>>>>>
> >>>>>>   PR:           kern/176179
> >>>>>>   Reported by:  Russell Cattelan (sigdeferstop() recursion)
> >>>>>>   Reviewed by:  kib
> >>>>>>   MFC after:    1 month
> >>>>> This change is causing init to crash for me on armv6. I'm
> >>>>> netbooting a PandaBoard and it appears init is receiving a
> >>>>> SIGABRT before it gets into main().
> >>>>>
> >>>>> Do you have any idea where I could look to track down why it is
> >>>>> doing this?
> >>>> It is weird. SIGABRT sent by the kernel usually means that
> >>>> execve(2) already destroyed the previous address space of the
> >>>> process, but the new image cannot be activated, most likely due
> >>>> to image format error discovered too late, or resource shortage.
> >>>>
> >>>> Could it be that some NFS RPC fails after the patch, but I cannot
> >>>> imagine why. You would need to track this. Also, verify that the
> >>>> init binary is correct.
> >>>>
> >>>> I tried amd64 netboot, and it worked fine.
> >>> It looks like this change is not the issue, it just changed the
> >>> symptom enough for me to not realise I was seeing an issue where
> >>> it would crash the kernel before. I reinstated this change but
> >>> only allowed the kernel to access half the memory and it booted
> >>> correctly.
> >>>
> >>> The real issue appears to be related to something in the vm layer
> >>> not working on ARM boards with too much memory (somewhere between
> >>> 512MiB and 1GiB).
> >>
> >> The recently introduced auto-sizing and cap may be too optimistic.
> >> In fact, they are greater than what we allow on 32-bit x86 and
> >> 32-bit MIPS. Try the following.
> >>
> >> Index: arm/include/vmparam.h
> >> ===================================================================
> >> --- arm/include/vmparam.h	(revision 247249)
> >> +++ arm/include/vmparam.h	(working copy)
> >> @@ -142,15 +142,15 @@
> >>  #define VM_KMEM_SIZE		(12*1024*1024)
> >>  #endif
> >>  #ifndef VM_KMEM_SIZE_SCALE
> >> -#define VM_KMEM_SIZE_SCALE	(2)
> >> +#define VM_KMEM_SIZE_SCALE	(3)
> >>  #endif
> >>
> >>  /*
> >> - * Ceiling on the size of the kmem submap: 60% of the kernel map.
> >> + * Ceiling on the size of the kmem submap: 40% of the kernel map.
> >>   */
> >>  #ifndef VM_KMEM_SIZE_MAX
> >>  #define VM_KMEM_SIZE_MAX	((vm_max_kernel_address - \
> >> -    VM_MIN_KERNEL_ADDRESS + 1) * 3 / 5)
> >> +    VM_MIN_KERNEL_ADDRESS + 1) * 2 / 5)
> >>  #endif
> >>
> >>  #define MAXTSIZ		(16*1024*1024)
> >>
> > This patch fixes the boot for me. Is it likely we will see similar
> > issues with boards with more memory with this? I know of ARM boards
> > with 2GiB of ram, and I would expect to see some with more soon.
> >
>
> The kmem submap should be fine, but other things might become a
> problem.
>
> What do "sysctl -x vm.min_kernel_address" and "sysctl -x
> vm.max_kernel_address" report on your machine?

I get the following.

# sysctl -x vm.min_kernel_address
vm.min_kernel_address: 0xc0000000
# sysctl -x vm.max_kernel_address
vm.max_kernel_address: 0xdf000000

Andrew