Date:      Sat, 13 May 1995 19:31:55 -0500 (CDT)
From:      Mike Pritchard <pritc003@maroon.tc.umn.edu>
To:        bde@zeta.org.au (Bruce Evans)
Cc:        uhclem@nemesis.lonestar.org, current@FreeBSD.org
Subject:   Re: bin/389: Problem #FDIV024
Message-ID:  <199505140031.TAA00204@mpp.com>
In-Reply-To: <199505090808.SAA00484@godzilla.zeta.org.au> from "Bruce Evans" at May 9, 95 06:08:42 pm

> >>Number:         389
> >>Category:       bin
> >>Synopsis:       Simultaneous creation/deletion of dirs corrupts filesystem [FDIV024]
> 
> I was able to reproduce the hang under -current:
> 
>   UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT TT       TIME COMMAND
>    15  1669   159   0 -14  0   172  488 uihget D+   v2    0:00.05 ls -Fg
>    15  1619   160   2 -14  0   296  616 ufslk2 D    v3    0:02.19 tar xf ../bin
>    15  1624   160   0 -18  0   296  620 vodead D    v3    0:00.88 tar xf ../bin
>...
>    15  1678   160   1 -14  0   160  476 ufslk2 D+   v3    0:00.04 ls -Fg
> 
> The problems seem to be easier to reproduce on a small file system.  I didn't
> see any on a new 512MB file system but I soon saw them on a 20000 block file
> system using bin.tar created from the current /usr/src/bin and running lots
> of tar xf's and rm -rf's concurrently.
> 
> This was running as non-root.  As root you have to worry about something
> unlinking "." and ".." directory entries.  I don't think anything actually
> does, but tar --unlink might.
> 
> Bruce

I think I just duplicated the problem in a slightly different manner
with a -current kernel that is about 2.5 weeks old at this point.

I was in /usr/src doing a "make world", and it had just reached the
point where it starts compiling.  I was also running a "du -s /usr"
at the same time.  The compile died with "bad file descriptor".  The
du also failed at around the same time, complaining about files in
/usr/src.

Checking the directory the build failed in showed that several files
still had directory entries, but their i-nodes had been released,
hence the "bad file descriptor" errors.
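
For illustration only, here is a minimal Python sketch of how such
entries could be listed: names that the directory still returns but
that lstat() can no longer resolve.  The function name
find_dangling_entries is hypothetical, and which errno a freed i-node
produces depends on the kernel, so the sketch reports whatever error
comes back rather than looking for one in particular.

```python
import os
import sys


def find_dangling_entries(directory):
    """Return (name, errno) pairs for entries whose lstat() fails.

    Hypothetical helper: on a healthy file system the result is empty.
    """
    dangling = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        try:
            os.lstat(path)  # lstat, so symlinks are not followed
        except OSError as exc:
            dangling.append((name, exc.errno))
    return dangling


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, err in find_dangling_entries(target):
        print(name, os.strerror(err))
```

An empty result does not prove the file system is clean; fsck remains
the authoritative check.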

fsck reported about four files in the failed build directory that
pointed to unallocated i-nodes, and a few others in some other
directories.  All the files were either binaries or .o's that would
have been removed by "make clean".

All my file systems were just fine about 10 hours ago when I
last booted, and the machine was idle all day until I started
the make/du, so I'm pretty sure that this is when the corruption
took place.

Since "du" only opens directories read-only and otherwise just calls
stat() (via the fts* routines), it looks like something in the
unlink/directory update routines isn't locking the directory
properly, or is releasing the lock a bit too soon.  It might also
have something to do with a stat() being active on the file being
deleted.  I was running as root in all of the above cases, in
case anyone was wondering.
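
The suspected interaction (a read-only walker calling stat() while
other tasks create and unlink files) can be sketched as a small
stress harness.  This is a rough Python approximation under stated
assumptions, not the make/du workload described above: it uses
threads rather than separate processes, os.walk() plus os.lstat()
stands in for du's fts traversal, and the names churn, walk_and_stat
and run_stress are made up for the example.  On a correct file system
the only error the walker should see is ENOENT (the file went away
between readdir and stat), and nothing should be left behind.

```python
import collections
import errno
import os
import shutil
import tempfile
import threading


def churn(root, worker_id, rounds, files_per_round=20):
    """Repeatedly create a small directory tree and remove it again."""
    for i in range(rounds):
        tree = os.path.join(root, "w%d-%d" % (worker_id, i))
        os.makedirs(os.path.join(tree, "sub"))
        for j in range(files_per_round):
            with open(os.path.join(tree, "sub", "f%d.o" % j), "w") as fh:
                fh.write("x")
        shutil.rmtree(tree)


def walk_and_stat(root, stop, errors):
    """Walk root read-only, calling lstat() on every file name seen."""
    while not stop.is_set():
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                try:
                    os.lstat(os.path.join(dirpath, name))
                except OSError as exc:
                    errors[exc.errno] += 1


def run_stress(workers=4, rounds=50):
    """Return (leftover entries, non-ENOENT error counts)."""
    root = tempfile.mkdtemp(prefix="dirstress-")
    stop = threading.Event()
    errors = collections.Counter()
    walker = threading.Thread(target=walk_and_stat, args=(root, stop, errors))
    walker.start()
    threads = [threading.Thread(target=churn, args=(root, n, rounds))
               for n in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    stop.set()
    walker.join()
    leftover = os.listdir(root)
    shutil.rmtree(root, ignore_errors=True)
    # ENOENT is the expected, harmless outcome of the race.
    unexpected = {e: c for e, c in errors.items() if e != errno.ENOENT}
    return leftover, unexpected


if __name__ == "__main__":
    print(run_stress())
```

Any leftover entries, or errors other than ENOENT, would be worth a
closer look; a clean run shows only that this particular load did not
trigger the problem.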
-- 
Mike Pritchard
pritc003@maroon.tc.umn.edu
"Go that way.  Really fast.  If something gets in your way, turn"


