Date: Mon, 14 May 2001 20:44:31 -0700 (PDT)
From: Matt Dillon
Message-Id: <200105150344.f4F3iVI45699@earth.backplane.com>
To: Kris Kennaway
Cc: Greg Lehey, Kris Kennaway, Terry Lambert, Kirk McKusick, Mikhail Teterin, cvs-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG, Ruslan Ermilov, fs@FreeBSD.ORG
Subject: Re: [kris@obsecurity.org: Re: cvs commit: src/etc rc]
References: <200105132342.QAA21879@beastie.mckusick.com> <200105142334.QAA05923@usr06.primenet.com> <20010515115630.H59553@wantadilla.lemis.com> <20010514193332.A85465@xor.obsecurity.org> <20010515120558.M59553@wantadilla.lemis.com> <20010514202707.B93481@xor.obsecurity.org>

I have to say, just IMHO, that as much as I like the concept of a background fsck, I will never ever in my life use the feature. I'll use the snapshots, definitely. But not the background fsck. It is plainly and simply too dangerous, *especially* on large partitions where one has a lot to lose if something goes wrong. UFS just isn't designed to be able to guarantee recovery, even if softupdates theoretically can't fail. We would need a log or journal to reach the safety factor that something like XFS or ReiserFS can theoretically achieve.

I welcome Kirk's addition of the feature, but I have to say that, IMHO, the *default* should not be to background fsck. The default should be to remain safe and foreground fsck.

If I have a huge partition that I intend to store a database in (for example), then judicious use of newfs's -c and -i options is sufficient to reduce fsck times.

Ultimately I believe that as storage systems get larger, the only safe solution is going to be replicated, distributed, quorum-based transactional filesystems. That way if a node goes down, it doesn't matter if it takes an hour to validate itself before coming back up. RAID-XYZ doesn't hack it -- it's still vulnerable to filesystem corruption due to software. Having written a database that does this sort of replication (in a read-write transactional environment), I've become a great believer in it and I think it is the only future for storing huge amounts of data.

I think it is possible to solve the slow-write problem (needing a quorum to commit a write) through the use of a client-side cache, similar to what NFS does (note: I haven't done this for my database yet, but I can see how it could be done for a filesystem). It gives me great peace of mind to know that I can pull the plug on an entire colocation site and have the realtime users of our product NOT notice that it happened.

-Matt
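
For readers unfamiliar with the term, the "quorum to commit a write" rule mentioned above can be sketched roughly as follows. This is only a minimal illustration assuming a simple majority quorum; the `Replica` class and `quorum_write` function are hypothetical names invented here, not the database described in this message, and a real system would also need versioning, rollback, and resynchronization of nodes that missed writes.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage node (hypothetical); `up` simulates reachability."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


def quorum_write(replicas, key, value):
    """Commit the write only if a strict majority of replicas is reachable.

    Assumption: reachability is checked up front, so a rejected write
    leaves every replica untouched. Returns True if the write committed.
    """
    needed = len(replicas) // 2 + 1
    reachable = [r for r in replicas if r.up]
    if len(reachable) < needed:
        return False  # no quorum: reject without modifying any replica
    for r in reachable:
        r.data[key] = value
    return True


if __name__ == "__main__":
    nodes = [Replica("a"), Replica("b"), Replica("c")]
    nodes[0].up = False                       # one node is down (e.g. in fsck)
    print(quorum_write(nodes, "row1", "v1"))  # two of three still form a quorum
```

With three replicas, one node can spend an hour validating itself while the other two keep accepting writes; the cost is that every write waits on a majority, which is the slow-write problem the message refers to.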