From owner-freebsd-current Sat May 13 17:34:25 1995
Return-Path: current-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id RAA16651 for current-outgoing; Sat, 13 May 1995 17:34:25 -0700
Received: from mpp.com (dialup-3-188.gw.umn.edu [134.84.101.188]) by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id RAA16644 for ; Sat, 13 May 1995 17:34:15 -0700
Received: (from mpp@localhost) by mpp.com (8.6.11/8.6.9) id TAA00204; Sat, 13 May 1995 19:31:55 -0500
From: Mike Pritchard
Message-Id: <199505140031.TAA00204@mpp.com>
Subject: Re: bin/389: Problem #FDIV024
To: bde@zeta.org.au (Bruce Evans)
Date: Sat, 13 May 1995 19:31:55 -0500 (CDT)
Cc: uhclem@nemesis.lonestar.org, current@FreeBSD.org
In-Reply-To: <199505090808.SAA00484@godzilla.zeta.org.au> from "Bruce Evans" at May 9, 95 06:08:42 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2674
Sender: current-owner@FreeBSD.org
Precedence: bulk

> >>Number: 389
> >>Category: bin
> >>Synopsis: Simultaneous creation/deletion of dirs corrupts filesystem [FDIV024]
>
> I was able to reproduce the hang under -current:
>
> UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
>  15  1669   159   0 -14  0   172  488 uihget D+    v2    0:00.05 ls -Fg
>  15  1619   160   2 -14  0   296  616 ufslk2 D     v3    0:02.19 tar xf ../bin
>  15  1624   160   0 -18  0   296  620 vodead D     v3    0:00.88 tar xf ../bin
>...
>  15  1678   160   1 -14  0   160  476 ufslk2 D+    v3    0:00.04 ls -Fg
>
> The problems seem to be easier to reproduce on a small file system. I didn't
> see any on a new 512MB file system but I soon saw them on a 20000 block file
> system using bin.tar created from the current /usr/src/bin and running lots
> of tar xf's and rm -rf's concurrently.
>
> This was running as non-root. As root you have to worry about something
> unlinking "." and ".." directory entries. I don't think anything actually
> does, but tar --unlink might.
>
> Bruce

I think I just duplicated the problem in a slightly different manner
with a -current kernel that is about 2.5 weeks old at this point.

I was in /usr/src doing a "make world", and it had just gotten to
the point where it was attempting to compile stuff. I was also
running a "du -s /usr" at the same time. The compile blew off with
"bad file descriptor". The du also blew off somewhere in the same
time frame, complaining about files in /usr/src.

Checking the directory the build failed in showed that I had several
files that were still present in the directory, but the i-nodes had
been released, thus the "bad file descriptor" problems. fsck reported
about 4 files in the bad build directory that pointed to unallocated
i-nodes, and a few others in some other directories. All the files
were either binaries or .o's that would have been removed by
"make clean".

All my file systems were just fine about 10 hours ago when I last
booted, and the machine was idle all day until I started the make/du,
so I'm pretty sure that this is when the corruption took place.

Since "du" is only opening up directories read-only, and all it does
is call "stat" (via the fts* routines), it looks like something in
the unlink/directory update routines isn't locking the directory
properly, or releasing the lock a bit too soon. It also might have
something to do with a stat() being active on the file being deleted.

I was running as root in all of the above cases, in case anyone was
wondering.
-- 
Mike Pritchard
pritc003@maroon.tc.umn.edu
"Go that way. Really fast. If something gets in your way, turn"