Date:      Tue, 26 Feb 2013 21:08:35 +1300
From:      Andrew Turner <andrew@fubar.geek.nz>
To:        Alan Cox <alc@rice.edu>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin <jhb@FreeBSD.org>
Subject:   Re: svn commit: r247116 - in head/sys: fs/nfs fs/nfsclient kern nfsclient sys tools
Message-ID:  <20130226210835.749cd816@bender>
In-Reply-To: <512C6916.5010608@rice.edu>
References:  <201302211902.r1LJ2o5T033708@svn.freebsd.org> <20130225201313.2050da18@bender> <20130225085019.GU2454@kib.kiev.ua> <20130225233603.49a5d4a5@bender> <8006325C-B281-4F4D-BE1A-C3B444FE979F@rice.edu> <20130226202707.026ad226@bender> <512C6916.5010608@rice.edu>

On Tue, 26 Feb 2013 01:49:42 -0600
Alan Cox <alc@rice.edu> wrote:

> On 02/26/2013 01:27, Andrew Turner wrote:
> > On Mon, 25 Feb 2013 15:00:41 -0600
> > Alan Cox <alc@rice.edu> wrote:
> >
> >> On Feb 25, 2013, at 4:36 AM, Andrew Turner wrote:
> >>
> >>> On Mon, 25 Feb 2013 10:50:19 +0200
> >>> Konstantin Belousov <kostikbel@gmail.com> wrote:
> >>>
> >>>> On Mon, Feb 25, 2013 at 08:13:13PM +1300, Andrew Turner wrote:
> >>>>> On Thu, 21 Feb 2013 19:02:50 +0000 (UTC)
> >>>>> John Baldwin <jhb@FreeBSD.org> wrote:
> >>>>>
> >>>>>> Author: jhb
> >>>>>> Date: Thu Feb 21 19:02:50 2013
> >>>>>> New Revision: 247116
> >>>>>> URL: http://svnweb.freebsd.org/changeset/base/247116
> >>>>>>
> >>>>>> Log:
> >>>>>>  Further refine the handling of stop signals in the NFS client.
> >>>>>> The changes in r246417 were incomplete as they did not add
> >>>>>> explicit calls to sigdeferstop() around all the places that
> >>>>>> previously passed SBDRY to _sleep().  In addition,
> >>>>>> nfs_getcacheblk() could trigger a write RPC from getblk()
> >>>>>> resulting in sigdeferstop() recursing. Rather than manually
> >>>>>> deferring stop signals in specific places, change the VFS_*()
> >>>>>> and VOP_*() methods to defer stop signals for filesystems which
> >>>>>> request this behavior via a new VFCF_SBDRY flag. Note that this
> >>>>>> has to be a VFC flag rather than a MNTK flag so that it works
> >>>>>> properly with VFS_MOUNT() when the mount is not yet fully
> >>>>>> constructed.  For now, only the NFS clients set this new
> >>>>>> flag in VFS_SET(). A few other related changes:
> >>>>>>  - Add an assertion to ensure that TDF_SBDRY doesn't leak to
> >>>>>> userland.
> >>>>>>  - When a lookup request uses VOP_READLINK() to follow a
> >>>>>> symlink, mark the request as being on behalf of the thread
> >>>>>> performing the lookup (cnp_thread) rather than using a NULL
> >>>>>> thread pointer. This causes NFS to properly handle signals
> >>>>>> during this VOP on an interruptible mount.
> >>>>>>
> >>>>>>  PR:		kern/176179
> >>>>>>  Reported by:	Russell Cattelan (sigdeferstop() recursion)
> >>>>>>  Reviewed by:	kib
> >>>>>>  MFC after:	1 month
> >>>>> This change is causing init to crash for me on armv6. I'm
> >>>>> netbooting a PandaBoard and it appears init is receiving a
> >>>>> SIGABRT before it gets into main().
> >>>>>
> >>>>> Do you have any idea where I could look to track down why it is
> >>>>> doing this?
> >>>> It is weird. SIGABRT sent by the kernel usually means that
> >>>> execve(2) already destroyed the previous address space of the
> >>>> process, but the new image cannot be activated, most likely due
> >>>> to image format error discovered too late, or resource shortage.
> >>>>
> >>>> It could be that some NFS RPC fails after the patch, but I cannot
> >>>> imagine why. You would need to track this. Also, verify that the
> >>>> init binary is correct.
> >>>>
> >>>> I tried amd64 netboot, and it worked fine.
> >>> It looks like this change is not the issue, it just changed the
> >>> symptom enough for me to not realise I was seeing an issue where
> >>> it would crash the kernel before. I reinstated this change but
> >>> only allowed the kernel to access half the memory and it booted
> >>> correctly.
> >>>
> >>> The real issue appears to be related to something in the vm layer
> >>> not working on ARM boards with too much memory (somewhere between
> >>> 512MiB and 1GiB).
> >>
> >> The recently introduced auto-sizing and cap may be too optimistic.
> >> In fact, they are greater than what we allow on 32-bit x86 and
> >> 32-bit MIPS.  Try the following.
> >>
> >> Index: arm/include/vmparam.h
> >> ===================================================================
> >> --- arm/include/vmparam.h	(revision 247249)
> >> +++ arm/include/vmparam.h	(working copy)
> >> @@ -142,15 +142,15 @@
> >>  #define VM_KMEM_SIZE		(12*1024*1024)
> >>  #endif
> >>  #ifndef VM_KMEM_SIZE_SCALE
> >> -#define VM_KMEM_SIZE_SCALE	(2)
> >> +#define VM_KMEM_SIZE_SCALE	(3)
> >>  #endif
> >>  
> >>  /*
> >> - * Ceiling on the size of the kmem submap: 60% of the kernel map.
> >> + * Ceiling on the size of the kmem submap: 40% of the kernel map.
> >>   */
> >>  #ifndef VM_KMEM_SIZE_MAX
> >>  #define	VM_KMEM_SIZE_MAX	((vm_max_kernel_address - \
> >> -    VM_MIN_KERNEL_ADDRESS + 1) * 3 / 5)
> >> +    VM_MIN_KERNEL_ADDRESS + 1) * 2 / 5)
> >>  #endif
> >>  
> >>  #define MAXTSIZ 	(16*1024*1024)
> >>
> > This patch fixes the boot for me. Is it likely we will see similar
> > issues with boards with more memory with this? I know of ARM boards
> > with 2GiB of ram, and I would expect to see some with more soon.
> >
> 
> The kmem submap should be fine, but other things might become a
> problem.
> 
> What do "sysctl -x vm.min_kernel_address" and "sysctl -x
> vm.max_kernel_address" report on your machine?

I get the following.

# sysctl -x vm.min_kernel_address
vm.min_kernel_address: 0xc0000000
# sysctl -x vm.max_kernel_address
vm.max_kernel_address: 0xdf000000

Andrew


