From owner-freebsd-bugs@FreeBSD.ORG  Thu Jul 20 15:00:51 2006
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 216CE16A4E1
	for <freebsd-bugs@hub.freebsd.org>;
	Thu, 20 Jul 2006 15:00:51 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8732A43D5F
	for <freebsd-bugs@hub.freebsd.org>;
	Thu, 20 Jul 2006 15:00:41 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k6KF0em2006449
	for <freebsd-bugs@freefall.freebsd.org>; Thu, 20 Jul 2006 15:00:40 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.4/8.13.4/Submit) id k6KF0dSu006447;
	Thu, 20 Jul 2006 15:00:39 GMT (envelope-from gnats)
Date: Thu, 20 Jul 2006 15:00:39 GMT
Message-Id: <200607201500.k6KF0dSu006447@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Yar Tikhiy <yar@comp.chem.msu.su>
Cc: 
Subject: Re: kern/87255: Large malloc-backed mfs crashes the system
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Yar Tikhiy <yar@comp.chem.msu.su>
List-Id: Bug reports <freebsd-bugs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-bugs>
List-Post: <mailto:freebsd-bugs@freebsd.org>
List-Help: <mailto:freebsd-bugs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 20 Jul 2006 15:00:51 -0000

The following reply was made to PR kern/87255; it has been noted by GNATS.

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Robert Watson <rwatson@FreeBSD.org>
Cc: freebsd-bugs@FreeBSD.org, bug-followup@FreeBSD.org
Subject: Re: kern/87255: Large malloc-backed mfs crashes the system
Date: Thu, 20 Jul 2006 18:52:34 +0400

 On Wed, Jul 05, 2006 at 12:16:11PM +0100, Robert Watson wrote:
 > On Wed, 26 Oct 2005, Yar Tikhiy wrote:
 > 
 > >> In all cases it is a "don't do that then" class of problem.
 > >
 > >Yes, of course.  The question is whether we consider it normal for root to 
 > >be able to panic the system using standard tools.  "cat /dev/zero > 
 > >/dev/mem" is still the ultimate way to do so.  IMHO the key issue is 
 > >whether we fall back to the academic/research stage, where rough edges 
 > >are OK and the system is just a toy for eggheads, or we claim our system 
 > >is stable and robust.  I doubt that an admin can crash the Windows NT 
 > >kernel from userland using conventional interfaces.  I by no means 
 > >expect this issue to be resolved soon, but it's worth reflecting on at 
 > >tea-time :-)
 > >
 > >Apropos, here's another reproducible crash induced by md:
 > >
 > >	# mdconfig -a -t malloc -s 300m
 > >	md0
 > >	# dd if=/dev/urandom of=/dev/md0 bs=1
 > >	dd: /dev/md0: Input/output error
 > >	79+0 records in
 > >	78+0 records out
 > >	# reboot
 > >	panic: kmem_malloc(4096): kmem_map too small: 86224896 total 
 > >	allocated
 > >
 > >Apparently, this is not the fault of md; our kernel memory allocator 
 > >simply lets other parts of the kernel starve it to death.
 > 
 > I'm not sure I entirely go along with this interpretation.  The answer to 
 > the question "What to do when the kernel runs out of address space?" is not 
 > easily found.  The "problem" is that md performs potentially unbounded 
 > allocation of a quite bounded resource -- remember that resource deadlocks 
 > are very real, sometimes it takes memory to release memory (abstractly, 
 > think of memory allocation as locking).  UMA supports allocator-enforced 
 > resource limits, which can be requested by the consumer using 
 > uma_zone_set_max().  md(4) should probably be using that interface and 
 > requesting a resource limit.
 
 The panic doesn't seem to be on a critical path in the kernel; it's
 in kmem_malloc(), which is essentially a utility routine.  Could
 the allocation attempt simply fail and let the caller decide what
 to do?  In fact, it can fail, but only when M_NOWAIT is set:
 
         if (vm_map_findspace(map, vm_map_min(map), size, &addr)) {
                 vm_map_unlock(map);
                 if ((flags & M_NOWAIT) == 0)
                         panic("kmem_malloc(%ld): kmem_map too small: %ld total allocated",
                                 (long)size, (long)map->size);
                 return (0);
         }
 
 It looks like we have to panic there merely because malloc(9)
 promises to succeed when waiting is allowed, even though there is
 no chance of success.  Isn't that a design issue?
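 
 To make the alternative concrete, here is a minimal user-space sketch
 of a bounded allocator that reports exhaustion to its caller instead
 of panicking.  It is not kernel code: struct arena, arena_alloc() and
 sector_store() are hypothetical names invented for this illustration,
 and the arena only stands in for a bounded map such as kmem_map.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bounded kernel map such as kmem_map. */
struct arena {
        size_t limit;   /* total bytes the arena may hand out */
        size_t used;    /* bytes handed out so far (never exceeds limit) */
};

/*
 * Allocate from the arena.  Exhaustion is reported by returning NULL,
 * so the caller decides what to do; nothing here panics.
 * Simplification: memory is never returned to the arena.
 */
static void *
arena_alloc(struct arena *a, size_t size)
{
        void *p;

        assert(a != NULL);
        if (size == 0 || size > a->limit - a->used)
                return (NULL);
        p = malloc(size);
        if (p != NULL)
                a->used += size;
        return (p);
}

/*
 * A consumer in the position of md(4): it backs a sector lazily and,
 * when the arena is exhausted, fails that one request with ENOMEM.
 */
static int
sector_store(struct arena *a, void **slot, size_t secsize)
{
        if (*slot == NULL) {
                *slot = arena_alloc(a, secsize);
                if (*slot == NULL)
                        return (ENOMEM);
        }
        return (0);
}
```

 With a 1024-byte arena and 512-byte sectors, the first two calls to
 sector_store() succeed and the third returns ENOMEM; the process
 keeps running, which is the behaviour the question above asks about.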
 
 > There is also a problem then regarding what happens when md(4) runs out of 
 > resources to allocate when it has already "promised" that it's a disk of a 
 > certain size up the stack.  I.e., if the result isn't a panic, then how 
 > will md(4) handle failure?  Most file systems will not be happy when they 
 > get EIO, so then perhaps the problem is that md(4) provides an abstraction 
 > for a non-sparse device up the storage stack, but is in fact 
 > over-committing.  This suggests that the size of an md device should be 
 > strictly bounded if it is malloc-backed.  Picking that maximum bound is 
 > also tricky.  This is why, in practice, we recommend using swap-backed md 
 > devices, so that the pages associated with the md device can be swapped out 
 > under memory pressure, and that the swap system have enough memory to fully 
 > back the md device.
 
 Perhaps md(4) shouldn't over-commit in malloc mode?  That will tie
 up precious physical memory, but that is what malloc mode is for.
 And a diskless system can't use swap-backed md.
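 
 As a rough illustration of not over-committing, the user-space sketch
 below reserves the whole backing store when the disk is configured,
 so an oversized request is refused up front and later writes cannot
 run out of memory.  All names here (struct ramdisk, ramdisk_create(),
 the budget parameter) are hypothetical and are not taken from md(4).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical malloc-backed disk with its store reserved up front. */
struct ramdisk {
        unsigned char *store;   /* nsectors * secsize bytes, or NULL */
        size_t nsectors;
        size_t secsize;
};

/*
 * Reserve every sector at configuration time.  "budget" is an assumed
 * administrative cap in bytes; a request above it fails with ENOMEM
 * here instead of surfacing as an I/O error on some later write.
 */
static int
ramdisk_create(struct ramdisk *rd, size_t nsectors, size_t secsize,
    size_t budget)
{
        assert(rd != NULL);
        rd->store = NULL;
        rd->nsectors = rd->secsize = 0;
        if (nsectors == 0 || secsize == 0)
                return (EINVAL);
        if (nsectors > budget / secsize)        /* overflow-safe */
                return (ENOMEM);
        rd->store = calloc(nsectors, secsize);
        if (rd->store == NULL)
                return (ENOMEM);
        rd->nsectors = nsectors;
        rd->secsize = secsize;
        return (0);
}

/* Writing never allocates, so it can only fail on a bad sector number. */
static int
ramdisk_write(struct ramdisk *rd, size_t sector, const void *buf)
{
        if (rd->store == NULL || sector >= rd->nsectors)
                return (EINVAL);
        memcpy(rd->store + sector * rd->secsize, buf, rd->secsize);
        return (0);
}

/* Release the backing store and mark the disk as unconfigured. */
static void
ramdisk_destroy(struct ramdisk *rd)
{
        free(rd->store);
        rd->store = NULL;
        rd->nsectors = rd->secsize = 0;
}
```

 The cost is the one mentioned above: the memory is committed whether
 or not the disk is ever filled.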
 
 -- 
 Yar