From owner-freebsd-hackers  Sat Sep 28 13:26:39 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id NAA24308
          for hackers-outgoing; Sat, 28 Sep 1996 13:26:39 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id NAA24290
          for <FreeBSD-hackers@freebsd.org>; Sat, 28 Sep 1996 13:26:36 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id NAA03244; Sat, 28 Sep 1996 13:25:31 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199609282025.NAA03244@phaeton.artisoft.com>
Subject: Re: cvs commit: src/sbin/fsdb fsdb.c
To: gurney_j@resnet.uoregon.edu
Date: Sat, 28 Sep 1996 13:25:30 -0700 (MST)
Cc: terry@lambert.org, FreeBSD-hackers@freebsd.org
In-Reply-To: <Pine.NEB.3.95.960928013546.291R-100000@nike> from "John-Mark Gurney" at Sep 28, 96 02:21:41 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> one quick question... any reason why FreeBSD doesn't have an unlink
> command?  at least only accessible to root?

The unlink command was a hold-over from the rename operation being
outside the kernel (rename = non-idempotent link new + unlink old).
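
As a rough illustration of that two-step rename (a sketch, not the
historical implementation), the sequence can be written in Python as
below.  The name userland_rename is hypothetical; os.link and os.unlink
are the standard wrappers for link(2) and unlink(2).  Between the two
calls both names refer to the same inode, and running the function a
second time fails because the old name is already gone.

```python
import os
import tempfile


def userland_rename(old, new):
    """Rename by hard-linking the new name, then unlinking the old one.

    Hypothetical helper for illustration only.  Between the two calls
    both names exist, so an interruption there leaves two links.
    """
    os.link(old, new)    # step 1: create the new name (fails if it exists)
    os.unlink(old)       # step 2: remove the old name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "a")
        new = os.path.join(d, "b")
        with open(old, "w") as f:
            f.write("data")
        userland_rename(old, new)
        print(os.path.exists(old), os.path.exists(new))
```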

Typically, an unlink command would be useful only for a directory
hard link.

Personally, I think directory hard links should not be allowed.  There
is nothing in POSIX requiring them, and they only cause problems for
recursive descent mechanisms that (erroneously) fail to check that the
inode for '..' is the inode you descended from.  If you do not include
directory hard links as an option, you can store (in directory inodes)
the inode of the parent, which would give you a much better "pwd"
mechanism, as well as allowing for NetWare-style "trustee" reverse
inheritance.
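
The '..' check mentioned above might look something like the following
Python sketch.  The function names parent_matches and walk are made up
for illustration, and comparing (st_dev, st_ino) pairs is an assumption
about how one would identify "the inode you descended from"; treat it
as one possible way to do the check, not as how any particular tool
does it.

```python
import os
import tempfile


def parent_matches(parent, child):
    """Return True if child's '..' entry refers to the directory `parent`.

    Compares (st_dev, st_ino) pairs to identify the directory.
    """
    up = os.stat(os.path.join(child, ".."))
    here = os.stat(parent)
    return (up.st_dev, up.st_ino) == (here.st_dev, here.st_ino)


def walk(top, visit):
    """Recursive descent that skips any directory whose '..' is not `top`."""
    visit(top)
    with os.scandir(top) as entries:
        for entry in entries:
            # Symbolic links are not followed; only real directories count.
            if entry.is_dir(follow_symlinks=False):
                if parent_matches(top, entry.path):
                    walk(entry.path, visit)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        os.makedirs(os.path.join(d, "x", "y"))
        walk(d, print)
```

A directory reached through an extra hard link would have a '..' that
points at its original parent, so the walk above would decline to
descend into it from the second parent.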

> > So: can you tell me if the condition resulted from fsck not catching
> > it after a crash, or if it resulted from normal operation of the FS?
> 
> my problem was from a system crash with the fs mount async...  I know this
> isn't good but I've had a large number of crashes when I didn't lose any
> data... and it does improve speed (i.e. config kernname completes in about
> 1 second :) )...

The fsck can only back out one level of consistency if you are mounted
async.

That said, this particular problem is detectable, and the error is in fsck.
David Greenman already posted Kirk's note to that effect.


> the errors that I got when trying to access the directory was:
> /usr:bad dir ino 85312 at offset 0: mangled entry
> panic:bad dir

Yeah; it means the entry at the directory offset was invalid data;
that will happen if the directory block is not self-consistent, which it
is not required to be once deallocated.

It's possible, running async as you were, that the directory data block
was reallocated for file data.  There's really no way to deterministically
recover this (other than to prevent it ever happening in the first place
by not running async).  If you want to prevent it, run sync (and if you
don't like the speed of sync, by all means, implement soft updates ;-)).


> I would fix it, but it never really got fixed (as it's invalid)...  and
> after a reboot fsck again reported the above error...

Yes.  fsck is in error.  Note that the fsck tool is only capable of
restoring the FS to a potentially consistent state.  If there is more
than one outstanding event unprocessed (each FS transaction can be
considered an event), then there are multiple potential precursor
states.  For N outstanding transactions, the number of precursor
states is 2^(N-1).  So if you had, say, 11 outstanding uncommitted
async metadata writes, then you had only a 1 in 1024 chance of fsck
guessing the right one.
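
The arithmetic can be checked with a few lines of Python.  This is only
a sketch of the 2^(N-1) figure given above; the function names are
hypothetical, and the "1 in 1024" reading assumes every precursor state
is equally likely to be the true one.

```python
def precursor_states(n):
    """Candidate precursor states for n outstanding transactions, 2^(n-1)."""
    if n < 1:
        raise ValueError("need at least one outstanding transaction")
    return 2 ** (n - 1)


def chance_of_right_guess(n):
    """Probability of picking the true state, assuming all are equally likely."""
    return 1 / precursor_states(n)


if __name__ == "__main__":
    n = 11
    print(precursor_states(n))        # 2^10 = 1024
    print(chance_of_right_guess(n))   # 1/1024 = 0.0009765625
```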

This is the general problem, in a nutshell, with async.  The stochastic
graph is similar for any FS considered as a set of transaction events
which can be scheduled in an unordered list -- so EXT2FS's default mode
is probably not a good idea -- turning on sync is better.

Again, soft updates are within 5% of memory bandwidth, so if you were
to implement them as your ordering mechanism instead of sync writes,
you would effectively get async speeds without the danger.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.