From owner-freebsd-stable@FreeBSD.ORG Fri Aug 27 20:42:19 2004
Date: Fri, 27 Aug 2004 16:42:17 -0400
From: Ken Smith
To: Colin Percival
Cc: Pavel Merdine
Cc: Ken Smith
Cc: freebsd-stable@freebsd.org
Subject: Re: ffs_alloc panic patch
Message-ID: <20040827204217.GC29928@electra.cse.Buffalo.EDU>
References: <1076237332.20040827215245@kaluga.ru> <20040827193605.GC28442@electra.cse.Buffalo.EDU> <6.1.0.6.1.20040827124846.03ac02d0@popserver.sfu.ca>
In-Reply-To: <6.1.0.6.1.20040827124846.03ac02d0@popserver.sfu.ca>
User-Agent: Mutt/1.4.1i
List-Id: Production branch of FreeBSD source code

On Fri, Aug 27, 2004 at 12:55:39PM -0700, Colin Percival wrote:
> At 12:36 27/08/2004, Ken Smith wrote:
> > ... Here you again wind up in a
> > situation where the filesystem data structures on the disk can
> > become corrupted.  Typically at some point the ffs code will
> > recognize that the metadata is incorrect and again a panic is
> > better than trying to carry on pretending nothing is wrong.
>
> Shouldn't a corrupt filesystem be handled by forcibly dismounting it,
> rather than invoking panic()?  We certainly don't want to keep on using
> a corrupt filesystem, but we should attempt to isolate a single failing
> piece of hardware rather than allowing it to bring down the entire
> system.

It's too hard to decide whether the filesystem in question can be lived
without.  Certainly / can't be lived without.  The contents of /var can't
be lived without either, and /var may or may not be on a separate
filesystem.  Suppose I chose to have /tmp on a separate filesystem and we
actually did manage to successfully unmount it (hee hee hee :-).
Typically the permissions of the underlying /tmp directory in the root
filesystem itself are wrong: setting it to rwxrwxrwt was done after /tmp
had been mounted, and it's the permissions from the mounted filesystem
that are normally seen.  Can we live without /usr?  Did we set up cron
jobs that will do really really really nasty things if the filesystem
doesn't look the way it should because something got unmounted that we
weren't expecting?  Suppose we're running a mirror site with a large
repository in /ftp, it fails, but the machine remains up and running -
downstream mirror sites connect, see nothing, and dutifully remove
everything on their site.

Too many decisions the kernel shouldn't make...  :-)

Isolating a single failing piece of hardware is RAID turf.

-- 
						Ken Smith
- From there to here, from here to      |       kensmith@cse.buffalo.edu
  there, funny things are everywhere.   |
                      - Theodore Geisel |