Date:      Tue, 25 Feb 1997 21:48:47 -0500 (EST)
From:      Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
To:        ponds!freefall.cdrom.com!freebsd-hackers
Subject:   Re: More on bad dir panics
Message-ID:  <199702260248.VAA19324@lakes.water.net>


> 
> 
> I have been trying to look around the crash dumps, as they are plentiful
> these days (twice a day seems to be the current rate).  These always happen
> at the same point and all crashes are similar: the crash occurs on directory
> lookup, stumbling over a block which contains something other than directory
> data.

 This "smells" very similar to my problems... perhaps we can divine
the intersection of these two problems and hit on a solution?

 Things I've determined:

	o) This can happen in a very light load.
	o) It happens on several types of hardware (SCSI, IDE, 386-586.)
	
 The problem appears to be related to inode allocation: an inode is
marked in the free inodes array as "available" (its bit isn't set),
but when some later code reads the inode from the disk and checks a
field (for the "dup alloc" panic, it's the "mode" field), it discovers
that, oops, the inode was in fact being used.
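
 To make that concrete, here is a minimal sketch in C of the kind of
disagreement I mean.  It is not the real FFS code: the toy_dinode
structure, the bitmap layout and the function names are all made up
for illustration.  It only assumes that a clear bit means "free" and
that a free inode should have a mode of 0 on disk.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in; not the real on-disk inode. */
struct toy_dinode {
    uint16_t mode;   /* assumed: 0 means "not in use" on disk */
};

/* Returns 1 if the bit for inode `ino` is set (in use) in the map. */
static int inode_in_use(const uint8_t *map, unsigned ino)
{
    return (map[ino / 8] >> (ino % 8)) & 1;
}

/*
 * Returns the first inode that the map calls free but whose on-disk
 * mode is nonzero (the "dup alloc" situation), or -1 if the map and
 * the disk agree for all `n` inodes.
 */
static int find_dup_alloc(const uint8_t *map,
                          const struct toy_dinode *disk, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        if (!inode_in_use(map, i) && disk[i].mode != 0)
            return (int)i;
    return -1;
}
```

 An inode that is marked in use and has a nonzero mode is fine; only
the "free in the map, nonzero on disk" combination is reported.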

 Does that sound familiar?

 Some other interesting observations:

	o) This can happen with a brand-new file system, if you write
		trash to the device, then do a newfs.  Newfs believes it
		has correctly filled in all the inodes with 0, but some
		(at least one in my tests) aren't correctly zeroed.

	o) The problem "strikes" and gets progressively worse until
		the file system simply falls apart.  I'm up to twice
		a day myself on my news server;  also, a find in
		/usr/spool/news now produces a lot of "Bad file descriptor"
		messages, indicating other file system problems that
		fsck didn't correct.

	o) Running fsck once isn't enough to restore a file system to
		a semi-usable state; if you fsck it once, then try again,
		you'll sometimes notice more corrections.

	o) This isn't "new" - it's something I've experienced in all
		2.1 releases (although, until now, I was about the
		sole reporter of the problem.)  I mention this to try
		to narrow the scope of what we're looking for.  It was
		something that happened in the 2.1.0 time-frame.
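
 For the newfs observation above, a check along the following lines
can be run over a buffer of raw inodes read back from the device.
This is only a sketch: the 128-byte inode size and the function name
are assumptions for illustration, and it should only be applied to
inodes that are expected to be unused (not, say, the root inode).

```c
#include <stddef.h>

#define TOY_INODE_SIZE 128  /* assumed on-disk inode size for this sketch */

/*
 * Scan a buffer holding `count` raw inodes and return the index of
 * the first one containing any nonzero byte, or -1 if all are zero.
 */
static long first_nonzero_inode(const unsigned char *buf, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const unsigned char *p = buf + i * TOY_INODE_SIZE;
        for (size_t j = 0; j < TOY_INODE_SIZE; j++)
            if (p[j] != 0)
                return (long)i;
    }
    return -1;
}
```

 The returned index is relative to the start of the buffer, so the
caller has to add the number of the first inode in the buffer to get
the real inode number.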


	- Dave Rivers -



