Date:      Sat, 1 Dec 2007 13:01:30 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        "Matthew D. Fuller" <fullermd@over-yonder.net>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: File remove problem
Message-ID:  <20071201125029.O15170@delplex.bde.org>
In-Reply-To: <20071130174431.GE31891@over-yonder.net>
References:  <474F4E46.8030109@nokia.com> <20071130112043.H7217@besplex.bde.org> <474F69A7.9090404@nokia.com> <20071130033743.GC31891@over-yonder.net> <20071130164034.D12284@delplex.bde.org> <20071130165529.V954@besplex.bde.org> <20071130174431.GE31891@over-yonder.net>

On Fri, 30 Nov 2007, Matthew D. Fuller wrote:

> On Fri, Nov 30, 2007 at 05:00:21PM +1100 I heard the voice of
> Bruce Evans, and lo! it spake thus:
>>
>> Oops, this is missing a rm, and doesn't work with it.
>
> Last year, it used to not cause the softdep_waitidle messages and
> prevent the fs from being remounted.  Instead, it would give an error
> like:
>
> hostname kernel: /: update error: blocks 28 files 2

softdep_waitidle() is new.  It now detects the problem earlier and
handles it more robustly by not allowing the mount -u.  Well, maybe
this is less robust since it also doesn't allow unmount.

> and WOULD remount it, and even set the clean flag, but would still
> leave turds lying around that would need a manual fsck to clean up
> (fsck -p obviously would completely skip it, since it was marked
> clean).  It was early this year that it moved from that annoying to

The problem also exposes bugs in fsck:
- even with the file system not marked clean (with later versions),
   fsck -p doesn't notice the problem.
- fsck without -p notices the problem, but takes 2 or 3 passes to fix
   it, and doesn't notice that it needs several passes.

> the "locked fs" crippling variant.  (n.b.: I don't have any real
> evidence that it's a mutation of the same problem, rather than two
> different ones, aside from the trigger condition apparently being the
> same, and the newer completely replacing the older.)

I think it is the same.  softdep_waitidle() just waits a bit for the
dependencies to flush after the flushing has been started, but the bug
leaves an unflushable dependency, so the wait always times out.
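
A rough sketch of that behaviour, in Python rather than kernel C.  This
is not the actual softdep_waitidle() code: the function names, the
dependency labels, the pass limit of 10 and the EBUSY return value are
all assumptions made for illustration.  It only shows why a bounded
wait can never succeed once a single dependency refuses to flush.

```python
import errno


def flush_pass(pending, stuck):
    """One flush pass: every dependency that is not stuck completes.

    Returns the set of dependencies still pending afterwards.
    """
    return {dep for dep in pending if dep in stuck}


def wait_idle(pending, stuck, max_passes=10):
    """Bounded wait for the pending set to drain.

    Returns 0 once nothing is pending, or errno.EBUSY if dependencies
    remain after max_passes flush passes (hypothetical limit).
    """
    for _ in range(max_passes):
        if not pending:
            return 0
        pending = flush_pass(pending, stuck)
    return 0 if not pending else errno.EBUSY
```

With no stuck dependency, wait_idle({"dep-a", "dep-b"}, stuck=set())
drains after one pass and returns 0.  With stuck={"dep-a"} it uses up
every pass and returns errno.EBUSY, however large max_passes is made.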

>> It takes a reboot per test.

And it has remarkable timing dependencies.  Once I got into a state in
which the bug didn't appear when exercised in a loop, even with the same
delays that seemed to trigger it fairly deterministically at other times.

Bruce


