Date:      Sat, 07 Jan 2006 10:45:01 -0700
From:      Scott Long <scottl@samsco.org>
To:        David Rhodus <drhodus@machdep.com>
Cc:        freebsd-current@freebsd.org, Pawel Jakub Dawidek <pjd@freebsd.org>
Subject:   Re: It still here... panic: ufs_dirbad: bad dir
Message-ID:  <43BFFE1D.4070502@samsco.org>
In-Reply-To: <fe77c96b0601070904n57d00a21mdf94281bc812dc50@mail.gmail.com>
References:  <20060102222723.GA1754@dragon.NUXI.org>	 <43BA9C5C.9010307@samsco.org>	 <20060106200009.GA53067@garage.freebsd.pl>	 <43BFF041.8070300@samsco.org> <fe77c96b0601070904n57d00a21mdf94281bc812dc50@mail.gmail.com>

David Rhodus wrote:
> On 1/7/06, Scott Long <scottl@samsco.org> wrote:
> 
>>Pawel Jakub Dawidek wrote:
>>
>>
>>>On Tue, Jan 03, 2006 at 08:46:36AM -0700, Scott Long wrote:
>>>+> David O'Brien wrote:
>>>+>
>>>+> >Just in case anyone thought the bug had been fixed...
>>>+> >FreeBSD 7.0-CURRENT #531: Mon Jan  2 11:32:17 PST 2006 i386
>>>+> >panic: ufs_dirbad: bad dir
>>>+> >cpuid = 1
>>>+> >KDB: stack backtrace:
>>>+> >kdb_backtrace(c06c9ba1,1,c06c03c6,eae718c8,c8a91480) at 0xc053657e = kdb_backtrace+0x2e
>>>+> >panic(c06c03c6,c85bf1f8,dade11,580,c06c0380) at 0xc0516618 = panic+0x128
>>>+> >ufs_dirbad(c9171bdc,580,c06c0380,0,eae7193c) at 0xc0616e4d = ufs_dirbad+0x4d
>>>+> >ufs_lookup(eae719e8,c916c528,eae71bc4,c916c528,eae71a24) at 0xc06165cd = ufs_lookup+0x3ad
>>>+> >VOP_CACHEDLOOKUP_APV(c06f2a80,eae719e8,eae71bc4,c8a91480,cac28d80) at 0xc068cd4e = VOP_CACHEDLOOKUP_APV+0x9e
>>>+> >vfs_cache_lookup(eae71a90,eae71a90,c916c528,c916c528,eae71bc4) at 0xc057275a = vfs_cache_lookup+0xca
>>>+> >VOP_LOOKUP_APV(c06f2a80,eae71a90,c8a91480,c106fc88,0) at 0xc068cc66 = VOP_LOOKUP_APV+0xa6
>>>+> >lookup(eae71b9c,0,c06b5c8e,b6,c057f7ed) at 0xc057760e = lookup+0x44e
>>>+> >namei(eae71b9c,eae71b3c,60,0,c8a91480) at 0xc0576ecf = namei+0x44f
>>>+> >kern_stat(c8a91480,8106f20,0,eae71c10,e0) at 0xc05863dd = kern_stat+0x3d
>>>+> >stat(c8a91480,eae71d04,8,43c,c8a91480) at 0xc058636f = stat+0x2f
>>>+> >syscall(3b,3b,3b,80dbe80,8106f20) at 0xc0682b43 = syscall+0x323
>>>+> >Xint0x80_syscall() at 0xc066d33f = Xint0x80_syscall+0x1f
>>>+>
>>>+> Please include the console printf that is right above the panic message.
>>>+> It will say either something about a mangled entry or an isize too
>>>+> small.  Since this problem is happening consistently for you, but there
>>>+> seem to be no other problem reports from others, I'd highly suspect that
>>>+> you have filesystem damage that isn't getting detected by fsck.  I assume
>>>+> that you are running fsck in the foreground and not in the background,
>>>+> yes?  The easiest solution here might be to figure out which directory
>>>+> is causing the problem, and just clri its inode and then clean up the
>>>+> mess.
>>>
>>>I'm able to reproduce it with a freshly newfs(8)ed file system:
>>>
>>>/mnt: bad dir ino 17382405 at offset 0: mangled entry
>>>panic: ufs_dirbad: bad dir
>>>KDB: enter: panic
>>>[...]
>>>db> tr
>>>Tracing pid 427 tid 100057 td 0xc7ccaa80
>>>kdb_enter(c060029a,c065c020,c0610849,f6b228c0,100) at kdb_enter+0x30
>>>panic(c0610849,c7914210,1093c05,0,c0610803) at panic+0xce
>>>ufs_dirbad(cb2b4b58,0,c0610803,0,f6b22934) at ufs_dirbad+0x4e
>>>ufs_lookup(f6b229e4,c061b519,cb092c60,cb092c60,f6b22b64) at ufs_lookup+0x39f
>>>VOP_CACHEDLOOKUP_APV(c063a7e0,f6b229e4,f6b22b64,c7ccaa80,c7d52b80) at VOP_CACHEDLOOKUP_APV+0xc4
>>>vfs_cache_lookup(f6b22a8c,f6b22a8c,0,cb092c60,0) at vfs_cache_lookup+0xc8
>>>VOP_LOOKUP_APV(c063a7e0,f6b22a8c,c7ccaa80,38,0) at VOP_LOOKUP_APV+0xa6
>>>lookup(f6b22b3c,0,c060880c,b5,c0511d45) at lookup+0x454
>>>namei(f6b22b3c,f6b22b8c,60,0,c7ccaa80) at namei+0x441
>>>kern_lstat(c7ccaa80,8059800,0,f6b22c10,2) at kern_lstat+0x5b
>>>lstat(c7ccaa80,f6b22d04,8,43c,c065c740) at lstat+0x2f
>>>syscall(805003b,807003b,bfbf003b,805f19c,bfbfeba0) at syscall+0x325
>>>Xint0x80_syscall() at Xint0x80_syscall+0x1f
>>>--- syscall (190, FreeBSD ELF32, lstat), eip = 0x28176efb, esp = 0xbfbfe90c, ebp = 0xbfbfea48 ---
>>>
>>
>>Since you can reproduce it, can you find out which test it is failing?
>>At the very least we need to add the test to fsck.
>>
>>Scott
> 
> 
> The main problem with dirbad panics is that the corruption occurred a
> long time ago, so a backtrace usually doesn't provide enough
> information to find out what went wrong.
> 
> Running fsck _should_ fix the filesystem corruption, but only after
> the problem has already occurred.  There are a few cases in which fsck
> needs to restart its current scan level or it can leave corruption
> inside the filesystem while marking the partition clean.
> 
> -DR

Yes, I'm well aware of all of this; that's why I'm asking Pawel to
determine which test is failing so we can find out why fsck isn't
catching it.
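
For anyone following along, the per-entry tests behind a "mangled entry"
report are roughly of the following shape. This is a simplified userland
sketch written from memory, not the kernel code: struct direct_sketch,
enum dir_verdict, and which_test_fails() are hypothetical names made up for
illustration, and the real checks live in sys/ufs/ufs/ufs_lookup.c and may
differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIRBLKSIZ 512   /* assumed directory block size */
#define MAXNAMLEN 255

/* Hypothetical stand-in for the on-disk directory entry layout. */
struct direct_sketch {
    uint32_t d_ino;      /* inode number, 0 means unused slot */
    uint16_t d_reclen;   /* length of this record in bytes */
    uint8_t  d_type;
    uint8_t  d_namlen;   /* length of the name, excluding the NUL */
    char     d_name[MAXNAMLEN + 1];
};

enum dir_verdict {
    DIR_OK,
    DIR_BAD_ALIGN,       /* d_reclen is not a multiple of 4 */
    DIR_CROSSES_BLOCK,   /* record runs past the end of its block */
    DIR_TOO_SMALL,       /* record cannot hold header plus name */
    DIR_BAD_NAME         /* NUL inside the name or missing terminator */
};

/* Smallest legal record: 8-byte header, name, NUL, rounded up to 4. */
static unsigned
dirsiz(unsigned namlen)
{
    return (8 + namlen + 1 + 3) & ~3u;
}

/* Report which sanity test an entry fails, or DIR_OK if none. */
static enum dir_verdict
which_test_fails(const struct direct_sketch *ep, unsigned offset_in_block)
{
    unsigned space;

    assert(ep != NULL);
    space = DIRBLKSIZ - (offset_in_block & (DIRBLKSIZ - 1));

    if ((ep->d_reclen & 0x3) != 0)
        return DIR_BAD_ALIGN;
    if (ep->d_reclen > space)
        return DIR_CROSSES_BLOCK;
    if (ep->d_reclen < dirsiz(ep->d_namlen))
        return DIR_TOO_SMALL;
    if (ep->d_ino == 0)
        return DIR_OK;   /* unused slot, the name is not inspected */
    if (memchr(ep->d_name, '\0', ep->d_namlen) != NULL ||
        ep->d_name[ep->d_namlen] != '\0')
        return DIR_BAD_NAME;
    return DIR_OK;
}
```

Returning a distinct value per test, rather than a single good/bad flag, is
the point here: it is the piece of information that would tell us which
check fsck needs to learn.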

Scott


