Date: Fri, 27 Aug 2004 16:42:17 -0400
From: Ken Smith <kensmith@cse.Buffalo.EDU>
To: Colin Percival <colin.percival@wadham.ox.ac.uk>
Cc: freebsd-stable@freebsd.org
Subject: Re: ffs_alloc panic patch
Message-ID: <20040827204217.GC29928@electra.cse.Buffalo.EDU>
In-Reply-To: <6.1.0.6.1.20040827124846.03ac02d0@popserver.sfu.ca>
References: <1076237332.20040827215245@kaluga.ru> <20040827193605.GC28442@electra.cse.Buffalo.EDU> <6.1.0.6.1.20040827124846.03ac02d0@popserver.sfu.ca>
On Fri, Aug 27, 2004 at 12:55:39PM -0700, Colin Percival wrote:
> At 12:36 27/08/2004, Ken Smith wrote:
> > ... Here you again wind up in a
> > situation where the filesystem data structures on the disk can
> > become corrupted. Typically at some point the ffs code will
> > recognize that the metadata is incorrect and again a panic is
> > better than trying to carry on pretending nothing is wrong.
>
> Shouldn't a corrupt filesystem be handled by forcibly dismounting it,
> rather than invoking panic()? We certainly don't want to keep on using
> a corrupt filesystem, but we should attempt to isolate a single failing
> piece of hardware rather than allowing it to bring down the entire
> system.

It's too hard to decide whether the filesystem in question can be
lived without. Certainly / can't be lived without. The contents of
/var can't be lived without, it may or may not be on a separate
filesystem. Suppose I chose to have /tmp on a separate filesystem
and we actually did manage to successfully unmount it (hee hee hee :-),
typically the directory permissions of the /tmp directory as
represented in the root filesystem itself are wrong (setting it to be
rwxrwxrwt was done after it had been mounted and it's the permissions
from the mounted filesystem that are normally seen).

Can we live without /usr? Did we set up cron jobs that will do really
really really nasty things if the filesystem doesn't look the way it
should because something got unmounted we weren't expecting? Suppose
we're running a mirror site with a large repository in /ftp, it fails,
but the machine remains up and running - downstream mirror sites
connect, see nothing, and dutifully remove everything on their site.

Too many decisions the kernel shouldn't make... :-)

The isolating a single failing piece of hardware thing is RAID turf.

--
Ken Smith
- From there to here, from here to      |  kensmith@cse.buffalo.edu
  there, funny things are everywhere.   |
                      - Theodore Geisel |