Date: Wed, 5 Jul 2006 12:16:11 +0100 (BST)
From: Robert Watson
To: Yar Tikhiy
In-Reply-To: <200510260910.j9Q9AKtg075166@freefall.freebsd.org>
Message-ID: <20060705120908.Q18236@fledge.watson.org>
References: <200510260910.j9Q9AKtg075166@freefall.freebsd.org>
Cc: freebsd-bugs@FreeBSD.org, bug-followup@FreeBSD.org
Subject: Re: kern/87255: Large malloc-backed mfs crashes the system

On Wed, 26 Oct 2005, Yar Tikhiy wrote:

> > In all cases it is a "don't do that then" class of problem.
>
> Yes, of course. The question is whether we consider it normal for root to
> have ability to panic the system using standard tools.
> "cat /dev/zero > /dev/mem" still is the ultimate way to. IMHO it is a key
> issue whether we fall back at the academical/research stage where rough
> corners are OK and the system is just a toy for eggheads, or we pretend our
> system is stable and robust.
> I doubt if an admin can crash the Windows NT kernel from the
> userland using conventional interfaces. I by no means expect this issue to
> be resolved soon, but it's worth being reflected on at tea-time :-)
>
> Apropos, here's another reproducible crash induced by md:
>
> # mdconfig -a -t malloc -s 300m
> md0
> # dd if=/dev/urandom of=/dev/md0 bs=1
> dd: /dev/md0: Input/output error
> 79+0 records in
> 78+9 records out
> # reboot
> panic: kmem_malloc(4096): kmem_map too small: 86224896 total allocated
>
> Apparently, it is not a fault of md, just our kernel memory allocator allows
> other kernel parts to starve it to death.

I'm not sure I entirely go along with this interpretation. The answer to the
question "What to do when the kernel runs out of address space?" is not easily
found. The "problem" is that md performs potentially unbounded allocation of a
quite bounded resource -- remember that resource deadlocks are very real;
sometimes it takes memory to release memory (abstractly, think of memory
allocation as locking). UMA supports allocator-enforced resource limits, which
can be requested by the consumer using uma_zone_set_max(). md(4) should
probably be using that interface and requesting a resource limit.

There is also the problem of what happens when md(4) runs out of resources to
allocate after it has already "promised" up the stack that it is a disk of a
certain size. I.e., if the result isn't a panic, then how will md(4) handle
failure? Most file systems will not be happy when they get EIO, so perhaps the
real problem is that md(4) presents a non-sparse device to the storage stack
while in fact over-committing. This suggests that the size of an md device
should be strictly bounded if it is malloc-backed. Picking that maximum bound
is also tricky.
This is why, in practice, we recommend using swap-backed md devices, so that
the pages associated with the md device can be swapped out under memory
pressure, and ensuring that the swap system has enough space to fully back the
md device.

Robert N M Watson
Computer Laboratory
University of Cambridge
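
As a rough illustration of the two options discussed above (an allocator-enforced limit versus strictly bounding the device size), here is a minimal userland sketch in C. It is not md(4) or UMA code: struct capped_zone, struct toy_md and all toy_md_* functions are hypothetical names made up for this example, the cap only imitates the kind of limit a kernel consumer would request with uma_zone_set_max(), and returning ENOSPC on exhaustion is an assumption of the sketch rather than a statement about what md(4) returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a UMA zone with a cap set via uma_zone_set_max(). */
struct capped_zone {
	size_t item_size;	/* bytes per item (one sector here) */
	size_t max_items;	/* allocator-enforced ceiling */
	size_t in_use;		/* items currently handed out */
};

/* Fail with NULL once the cap is reached instead of draining all memory. */
static void *
zone_alloc(struct capped_zone *z)
{
	void *p;

	if (z->in_use >= z->max_items)
		return (NULL);
	p = calloc(1, z->item_size);
	if (p != NULL)
		z->in_use++;
	return (p);
}

/* Hypothetical malloc-backed disk; sectors are allocated on first write. */
struct toy_md {
	struct capped_zone zone;
	size_t nsectors;	/* size promised up the storage stack */
	void **sectors;		/* entry is NULL until that sector is written */
};

/*
 * With strict != 0, refuse a size the zone cannot fully back (no
 * over-commit). With strict == 0, accept it and let writes fail later.
 */
static int
toy_md_create(struct toy_md *md, size_t nsectors, size_t sector_size,
    size_t max_items, int strict)
{
	assert(sector_size > 0 && nsectors > 0);
	if (strict && nsectors > max_items)
		return (EINVAL);
	md->sectors = calloc(nsectors, sizeof(*md->sectors));
	if (md->sectors == NULL)
		return (ENOMEM);
	md->nsectors = nsectors;
	md->zone.item_size = sector_size;
	md->zone.max_items = max_items;
	md->zone.in_use = 0;
	return (0);
}

/* Write one full sector; ENOSPC is this sketch's choice of error code. */
static int
toy_md_write(struct toy_md *md, size_t sector, const void *buf)
{
	if (sector >= md->nsectors)
		return (EINVAL);
	if (md->sectors[sector] == NULL) {
		md->sectors[sector] = zone_alloc(&md->zone);
		if (md->sectors[sector] == NULL)
			return (ENOSPC);	/* over-committed: no backing left */
	}
	memcpy(md->sectors[sector], buf, md->zone.item_size);
	return (0);
}

/* Release every allocated sector and the sector table itself. */
static void
toy_md_destroy(struct toy_md *md)
{
	for (size_t i = 0; i < md->nsectors; i++)
		free(md->sectors[i]);
	free(md->sectors);
}
```

With an 8-sector device and a cap of 4 items, the strict variant rejects the configuration up front, while the over-committing variant accepts it and the write to a fifth distinct sector fails. That late failure is the case a file system on top is unlikely to handle gracefully, which is the argument for bounding the size at configuration time or for using swap backing instead.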