Date: Mon, 25 Nov 2002 15:26:04 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Kris Kennaway <kris@obsecurity.org>
Cc: Robert Watson <rwatson@FreeBSD.ORG>, Mikhail Teterin <mi+mx@aldan.algebra.com>, current@FreeBSD.ORG
Subject: Re: -current unusable after a crash
Message-ID: <3DE2B18C.AA33F92F@mindspring.com>
References: <200211250959.39594.mi%2Bmx@aldan.algebra.com>
    <Pine.NEB.3.96L.1021125102358.33619A-100000@fledge.watson.org>
    <20021125172445.GA8953@rot13.obsecurity.org>
    <3DE29DE6.CDD96F3F@mindspring.com>
    <20021125221748.GA11747@rot13.obsecurity.org>

Kris Kennaway wrote:
> On Mon, Nov 25, 2002 at 02:02:14PM -0800, Terry Lambert wrote:
> > I don't think this is really possible.
>
> Yeah :(
>
> > If you made system dumps mandatory (or marked swap with a non-dump
> > header in case of panic), this still would not handle the "silent
> > reboot", "double panic", or "single panic with disk I/O trashed"
> > cases.  8-(.
>
> And the panics that affect the disk/filesystem are likely to not give
> a crashdump, but at the same time are likely to cause FS problems for
> bgfsck :-(

Actually, the worst problems come when the corruption does not
subsequently result in a crash.

If you just crashed again, you could simply set a flag in the
superblock that said "background fsck in progress", and if that flag
was set at boot time, then do a full fsck (knowing you died during a
background fsck).

If you don't get a second crash, and you reboot, you're screwed.

You could add another utility to say "force full fsck" -- basically,
to set the flag manually. This is a pain because you have to do it
through an fcntl() or ioctl(), since there are no block devices to use
to do the work, and you can't open a mounted device for writing; even
if you know what you are doing, the OS enforces this as if it were
smarter than you.

We ran into exactly this same problem in the InterJet, when we first
paid Kirk to have soft updates ported to FreeBSD (I actually did the
preliminary "make it compile" work, and Julian did most of the
debugging; I helped some after that, but my boss didn't like me doing
it). The point was to get rid of the need for a UPS in the InterJet.

A log structured FS doesn't actually have this problem, but is a real
pain because of the need for a "cleaner" to run constantly, to garbage
collect, which makes things that used to take deterministic time take
variable time. Not very good for multimedia or streaming content
serving.
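
The boot-time decision described above (full fsck if the "background
fsck in progress" flag or a manually set "force full fsck" flag is
found) might be sketched roughly as follows. This is only an
illustration: the struct, the flag names and the helper functions are
all hypothetical, and none of it is the real FFS superblock layout or
the actual fsck code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; NOT the real FFS superblock fields. */
#define SB_BGFSCK_INPROGRESS 0x01u /* set when a background fsck starts */
#define SB_FORCE_FULL_FSCK   0x02u /* set by hand via an admin utility  */

/* Toy stand-in for the on-disk superblock (assumption for illustration). */
struct toy_superblock {
    uint32_t flags;
    bool     clean; /* true if the filesystem was unmounted cleanly */
};

enum fsck_mode { FSCK_NONE, FSCK_BACKGROUND, FSCK_FULL };

/* Mark the start of a background fsck before touching anything else. */
static void bgfsck_begin(struct toy_superblock *sb)
{
    assert(sb != NULL);
    sb->flags |= SB_BGFSCK_INPROGRESS;
}

/* Clear the marker only once the background fsck has finished. */
static void bgfsck_end(struct toy_superblock *sb)
{
    assert(sb != NULL);
    sb->flags &= ~SB_BGFSCK_INPROGRESS;
    sb->clean = true;
}

/* Boot-time decision: which kind of fsck should run? */
static enum fsck_mode choose_fsck_mode(const struct toy_superblock *sb)
{
    assert(sb != NULL);
    /* Died during a background fsck, or the admin asked for it. */
    if (sb->flags & (SB_BGFSCK_INPROGRESS | SB_FORCE_FULL_FSCK))
        return FSCK_FULL;
    /* Ordinary unclean shutdown: a background fsck is attempted. */
    if (!sb->clean)
        return FSCK_BACKGROUND;
    return FSCK_NONE;
}
```

Note that the sketch only covers the cases where a flag actually gets
set. The "corruption without a second crash" case leaves no flag
behind, which is exactly why it is the hard one.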

The InterJet handled this by having a DC holdup time following AC
failure notification, which was enough to throw a stick into the
spokes, to prevent the wheels from turning, and the bicycle falling
over the cliff.

Another way to handle it would be CMOS, with a BIOS initialization
(e.g. set bit 1 of the "crash state") that didn't affect the bits
that indicated the failure mode. Unfortunately, the computer
manufacturers have not really agreed on a standard for this sort of
thing, nor do they think anyone in OS space or userland should be
able to own a section of CMOS memory (no OS allocation policy,
tagging, etc.).

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message