Date: Sat, 7 Jan 2006 12:04:35 -0500
From: David Rhodus <drhodus@machdep.com>
To: Scott Long <scottl@samsco.org>
Cc: freebsd-current@freebsd.org, Pawel Jakub Dawidek <pjd@freebsd.org>
Subject: Re: It still here... panic: ufs_dirbad: bad dir
Message-ID: <fe77c96b0601070904n57d00a21mdf94281bc812dc50@mail.gmail.com>
In-Reply-To: <43BFF041.8070300@samsco.org>
References: <20060102222723.GA1754@dragon.NUXI.org> <43BA9C5C.9010307@samsco.org> <20060106200009.GA53067@garage.freebsd.pl> <43BFF041.8070300@samsco.org>

On 1/7/06, Scott Long <scottl@samsco.org> wrote:
> Pawel Jakub Dawidek wrote:
>
> > On Tue, Jan 03, 2006 at 08:46:36AM -0700, Scott Long wrote:
> > +> David O'Brien wrote:
> > +>
> > +> >Just in case anyone thought the bug had been fixed...
> > +> >FreeBSD 7.0-CURRENT #531: Mon Jan 2 11:32:17 PST 2006 i386
> > +> >panic: ufs_dirbad: bad dir
> > +> >cpuid = 1
> > +> >KDB: stack backtrace:
> > +> >kdb_backtrace(c06c9ba1,1,c06c03c6,eae718c8,c8a91480) at 0xc053657e = kdb_backtrace+0x2e
> > +> >panic(c06c03c6,c85bf1f8,dade11,580,c06c0380) at 0xc0516618 = panic+0x128
> > +> >ufs_dirbad(c9171bdc,580,c06c0380,0,eae7193c) at 0xc0616e4d = ufs_dirbad+0x4d
> > +> >ufs_lookup(eae719e8,c916c528,eae71bc4,c916c528,eae71a24) at 0xc06165cd = ufs_lookup+0x3ad
> > +> >VOP_CACHEDLOOKUP_APV(c06f2a80,eae719e8,eae71bc4,c8a91480,cac28d80) at 0xc068cd4e = VOP_CACHEDLOOKUP_APV+0x9e
> > +> >vfs_cache_lookup(eae71a90,eae71a90,c916c528,c916c528,eae71bc4) at 0xc057275a = vfs_cache_lookup+0xca
> > +> >VOP_LOOKUP_APV(c06f2a80,eae71a90,c8a91480,c106fc88,0) at 0xc068cc66 = VOP_LOOKUP_APV+0xa6
> > +> >lookup(eae71b9c,0,c06b5c8e,b6,c057f7ed) at 0xc057760e = lookup+0x44e
> > +> >namei(eae71b9c,eae71b3c,60,0,c8a91480) at 0xc0576ecf = namei+0x44f
> > +> >kern_stat(c8a91480,8106f20,0,eae71c10,e0) at 0xc05863dd = kern_stat+0x3d
> > +> >stat(c8a91480,eae71d04,8,43c,c8a91480) at 0xc058636f = stat+0x2f
> > +> >syscall(3b,3b,3b,80dbe80,8106f20) at 0xc0682b43 = syscall+0x323
> > +> >Xint0x80_syscall() at 0xc066d33f = Xint0x80_syscall+0x1f
> > +>
> > +> Please include the console printf that is right about the panic message.
> > +> It will say either something about a mangled entry or an isize too
> > +> small. Since this problem is happening consistently for you, but there
> > +> seem to be no other problem reports from others, I'd highly suspect that
> > +> you have filesystem damage that isn't getting detected by fsck. I assume
> > +> that you are running fsck in the foreground and not in the background,
> > +> yes? The easiest solution
> > +> here might be to figure out which
> > +> directory is causing the problem, and just clri its inode and then clean
> > +> up the mess.
> >
> > I'm able to reproduce it with newly newfs(8)ed file system:
> >
> > /mnt: bad dir ino 17382405 at offset 0: mangled entry
> > panic: ufs_dirbad: bad dir
> > KDB: enter: panic
> > [...]
> > db> tr
> > Tracing pid 427 tid 100057 td 0xc7ccaa80
> > kdb_enter(c060029a,c065c020,c0610849,f6b228c0,100) at kdb_enter+0x30
> > panic(c0610849,c7914210,1093c05,0,c0610803) at panic+0xce
> > ufs_dirbad(cb2b4b58,0,c0610803,0,f6b22934) at ufs_dirbad+0x4e
> > ufs_lookup(f6b229e4,c061b519,cb092c60,cb092c60,f6b22b64) at ufs_lookup+0x39f
> > VOP_CACHEDLOOKUP_APV(c063a7e0,f6b229e4,f6b22b64,c7ccaa80,c7d52b80) at VOP_CACHEDLOOKUP_APV+0xc4
> > vfs_cache_lookup(f6b22a8c,f6b22a8c,0,cb092c60,0) at vfs_cache_lookup+0xc8
> > VOP_LOOKUP_APV(c063a7e0,f6b22a8c,c7ccaa80,38,0) at VOP_LOOKUP_APV+0xa6
> > lookup(f6b22b3c,0,c060880c,b5,c0511d45) at lookup+0x454
> > namei(f6b22b3c,f6b22b8c,60,0,c7ccaa80) at namei+0x441
> > kern_lstat(c7ccaa80,8059800,0,f6b22c10,2) at kern_lstat+0x5b
> > lstat(c7ccaa80,f6b22d04,8,43c,c065c740) at lstat+0x2f
> > syscall(805003b,807003b,bfbf003b,805f19c,bfbfeba0) at syscall+0x325
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (190, FreeBSD ELF32, lstat), eip = 0x28176efb, esp = 0xbfbfe90c, ebp = 0xbfbfea48 ---
> >
>
> Since you can reproduce it, can you find out which test it is failing?
> At the very least we need to add the test to fsck.
>
> Scott

The main problem with dirbad panics is that the corruption occurred a long
time ago, so a backtrace usually doesn't provide enough information to find
out what went wrong. Running fsck _should_ fix the filesystem corruption,
but only after the problem has already occurred.
There are a few cases in which fsck needs to restart its current scan level;
otherwise it can leave corruption inside the filesystem while marking the
partition clean.

-DR
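
Scott's question about which test is failing could be explored offline against a raw directory block dumped from the affected filesystem. The following is only a rough userland sketch in Python, not the kernel's or fsck's code: it applies the kind of per-entry sanity checks that lead to a "mangled entry" report and names the first one that fails. The 8-byte header layout, the little-endian byte order, the 512-byte `DIRBLKSIZ`, and the function and message names are all assumptions made for illustration.

```python
import struct

DIRBLKSIZ = 512  # assumed directory block size
# Assumed on-disk entry header: d_ino (u32), d_reclen (u16), d_type (u8), d_namlen (u8)
HEADER = struct.Struct("<IHBB")


def dirsiz(namlen):
    """Smallest valid record: header + name + NUL, rounded up to 4 bytes."""
    return (HEADER.size + namlen + 1 + 3) & ~3


def failing_test(block, offset):
    """Return the name of the first check the entry at `offset` fails, else None.

    `block` must hold at least one full directory block of raw bytes.
    The check names are hypothetical labels, not kernel messages.
    """
    ino, reclen, _dtype, namlen = HEADER.unpack_from(block, offset)
    space_left = DIRBLKSIZ - (offset & (DIRBLKSIZ - 1))

    if reclen == 0:
        return "zero reclen"
    if reclen & 0x3:
        return "reclen not 4-byte aligned"
    if reclen > space_left:
        return "reclen crosses block boundary"
    if reclen < dirsiz(namlen):
        return "reclen too small for name"
    if ino == 0:
        return None  # unused slot: the name bytes are not examined

    start = offset + HEADER.size
    name = block[start:start + namlen + 1]
    if b"\0" in name[:namlen]:
        return "NUL inside name"
    if name[namlen:] != b"\0":
        return "name not NUL-terminated"
    return None
```

Running `failing_test(block, 0)` on the block that holds the reported offset would show which of these conditions the entry violates, which is the information needed before an equivalent test can be considered for fsck.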