From: Terry Lambert
Date: Mon, 25 Nov 2002 15:26:04 -0800
To: Kris Kennaway
Cc: Robert Watson, Mikhail Teterin, current@FreeBSD.ORG
Subject: Re: -current unusable after a crash
Message-ID: <3DE2B18C.AA33F92F@mindspring.com>
References: <200211250959.39594.mi+mx@aldan.algebra.com>
 <20021125172445.GA8953@rot13.obsecurity.org>
 <3DE29DE6.CDD96F3F@mindspring.com>
 <20021125221748.GA11747@rot13.obsecurity.org>

Kris Kennaway wrote:
> On Mon, Nov 25, 2002 at 02:02:14PM -0800, Terry Lambert wrote:
> > I don't think this is really possible.
>
> Yeah :(
>
> > If you made system dumps mandatory (or marked swap with a non-dump
> > header in case of panic), this still would not handle the "silent
> > reboot", "double panic", or "single panic with disk I/O trashed"
> > cases. 8-(.
>
> And the panics that affect the disk/filesystem are likely to not give
> a crashdump, but at the same time are likely to cause FS problems for
> bgfsck :-(

Actually, the worst problems come when the corruption does not
subsequently result in a crash. If you just crashed again, you could
simply set a flag in the superblock that said "background fsck in
progress"; if that flag was set at boot time, you would do a full fsck
(knowing you died during a background fsck). If you don't get a second
crash, and you reboot, you're screwed.

You could add another utility to say "force full fsck" -- basically, to
set the flag manually. This is a pain because you have to do it through
an fcntl() or ioctl(), since there are no block devices to use to do the
work, and you can't open a mounted device to write it; even if you know
what you are doing, the OS enforces this as if it's smarter than you.

We ran into exactly this same problem in the InterJet, when we first
paid Kirk to have soft updates ported to FreeBSD (I actually did the
preliminary "make it compile" work, and Julian did most of the
debugging; I helped some after that, but my boss didn't like me doing
it). The point was to get rid of the need for a UPS in the InterJet.

A log structured FS doesn't actually have this problem, but is a real
pain because of the need for a "cleaner" to run constantly, to garbage
collect, which makes things that used to take deterministic time take
variable time. Not very good for multimedia or streaming content
serving.

The InterJet handled this by having a DC holdup time following AC
failure notification, which was enough to throw a stick into the spokes,
to prevent the wheels from turning, and the bicycle falling over the
cliff.

Another way to handle it would be CMOS, with a BIOS initialization
(e.g. set bit 1 of the "crash state") that didn't affect the bits that
indicated the failure mode.
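
As a rough illustration of the flag handling described above (the
"background fsck in progress" flag, the manual "force full fsck"
request, and an initialization step that sets one bit without touching
the failure-mode bits), here is a minimal C sketch. The single-byte
layout, every bit position other than bit 1, and all of the names are
assumptions made up for this example; they are not taken from FreeBSD,
UFS, or any BIOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical one-byte "crash state". Every name and bit position
 * here is an assumption for illustration, not a real BIOS or FreeBSD
 * layout.
 */
#define CS_BOOT_INIT     0x02u /* bit 1: set by early initialization */
#define CS_BGFSCK_ACTIVE 0x04u /* background fsck started, not finished */
#define CS_FORCE_FULL    0x08u /* full fsck requested by hand */
#define CS_FAILURE_MASK  0xF0u /* failure-mode bits from the last run */

/* The initialization bit must not overlap the failure-mode bits. */
static_assert((CS_BOOT_INIT & CS_FAILURE_MASK) == 0, "bit overlap");

/* Mark initialization without disturbing any other bit. */
static uint8_t
crash_state_boot_init(uint8_t state)
{
	return (uint8_t)(state | CS_BOOT_INIT);
}

/* Record that a background fsck has started (true) or finished (false). */
static uint8_t
crash_state_set_bgfsck(uint8_t state, bool active)
{
	if (active)
		return (uint8_t)(state | CS_BGFSCK_ACTIVE);
	return (uint8_t)(state & ~CS_BGFSCK_ACTIVE);
}

/* A full fsck is needed if we died mid-bgfsck or one was forced by hand. */
static bool
need_full_fsck(uint8_t state)
{
	return (state & (CS_BGFSCK_ACTIVE | CS_FORCE_FULL)) != 0;
}
```

Whether such a byte would live in the superblock or in CMOS is left open
here; the sketch only shows the bit handling and the boot-time decision,
not how the byte gets written on a mounted filesystem.
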
Unfortunately, the computer manufacturers have not really agreed on a
standard for this sort of thing, nor do they think anyone in OS space or
userland should be able to own a section of CMOS memory (no OS
allocation policy, tagging, etc.).

-- Terry