From: Scott Long
Date: Sat, 07 Jan 2006 10:45:01 -0700
To: David Rhodus
Cc: freebsd-current@freebsd.org, Pawel Jakub Dawidek
Subject: Re: It still here... panic: ufs_dirbad: bad dir
Message-ID: <43BFFE1D.4070502@samsco.org>

David Rhodus wrote:
> On 1/7/06, Scott Long wrote:
>
>>Pawel Jakub Dawidek wrote:
>>
>>
>>>On Tue, Jan 03, 2006 at 08:46:36AM -0700, Scott Long wrote:
>>>+> David O'Brien wrote:
>>>+>
>>>+>
>>>+> >Just in case anyone thought the bug had been fixed...
>>>+> >FreeBSD 7.0-CURRENT #531: Mon Jan 2 11:32:17 PST 2006 i386
>>>+> >panic: ufs_dirbad: bad dir
>>>+> >cpuid = 1
>>>+> >KDB: stack backtrace:
>>>+> >kdb_backtrace(c06c9ba1,1,c06c03c6,eae718c8,c8a91480) at 0xc053657e = kdb_backtrace+0x2e
>>>+> >panic(c06c03c6,c85bf1f8,dade11,580,c06c0380) at 0xc0516618 = panic+0x128
>>>+> >ufs_dirbad(c9171bdc,580,c06c0380,0,eae7193c) at 0xc0616e4d = ufs_dirbad+0x4d
>>>+> >ufs_lookup(eae719e8,c916c528,eae71bc4,c916c528,eae71a24) at 0xc06165cd = ufs_lookup+0x3ad
>>>+> >VOP_CACHEDLOOKUP_APV(c06f2a80,eae719e8,eae71bc4,c8a91480,cac28d80) at 0xc068cd4e = VOP_CACHEDLOOKUP_APV+0x9e
>>>+> >vfs_cache_lookup(eae71a90,eae71a90,c916c528,c916c528,eae71bc4) at 0xc057275a = vfs_cache_lookup+0xca
>>>+> >VOP_LOOKUP_APV(c06f2a80,eae71a90,c8a91480,c106fc88,0) at 0xc068cc66 = VOP_LOOKUP_APV+0xa6
>>>+> >lookup(eae71b9c,0,c06b5c8e,b6,c057f7ed) at 0xc057760e = lookup+0x44e
>>>+> >namei(eae71b9c,eae71b3c,60,0,c8a91480) at 0xc0576ecf = namei+0x44f
>>>+> >kern_stat(c8a91480,8106f20,0,eae71c10,e0) at 0xc05863dd = kern_stat+0x3d
>>>+> >stat(c8a91480,eae71d04,8,43c,c8a91480) at 0xc058636f = stat+0x2f
>>>+> >syscall(3b,3b,3b,80dbe80,8106f20) at 0xc0682b43 = syscall+0x323
>>>+> >Xint0x80_syscall() at 0xc066d33f = Xint0x80_syscall+0x1f
>>>+>
>>>+> Please include the console printf that is right above the panic message.
>>>+> It will say either something about a mangled entry or an isize too
>>>+> small.
>>>+> Since this problem is happening consistently for you, but there
>>>+> seem to be no other problem reports from others, I'd highly suspect that
>>>+> you have filesystem damage that isn't getting detected by fsck. I assume
>>>+> that you are running fsck in the foreground and not in the background,
>>>+> yes? The easiest solution here might be to figure out which
>>>+> directory is causing the problem, and just clri its inode and then clean
>>>+> up the mess.
>>>
>>>I'm able to reproduce it with newly newfs(8)ed file system:
>>>
>>>/mnt: bad dir ino 17382405 at offset 0: mangled entry
>>>panic: ufs_dirbad: bad dir
>>>KDB: enter: panic
>>>[...]
>>>db> tr
>>>Tracing pid 427 tid 100057 td 0xc7ccaa80
>>>kdb_enter(c060029a,c065c020,c0610849,f6b228c0,100) at kdb_enter+0x30
>>>panic(c0610849,c7914210,1093c05,0,c0610803) at panic+0xce
>>>ufs_dirbad(cb2b4b58,0,c0610803,0,f6b22934) at ufs_dirbad+0x4e
>>>ufs_lookup(f6b229e4,c061b519,cb092c60,cb092c60,f6b22b64) at ufs_lookup+0x39f
>>>VOP_CACHEDLOOKUP_APV(c063a7e0,f6b229e4,f6b22b64,c7ccaa80,c7d52b80) at VOP_CACHEDLOOKUP_APV+0xc4
>>>vfs_cache_lookup(f6b22a8c,f6b22a8c,0,cb092c60,0) at vfs_cache_lookup+0xc8
>>>VOP_LOOKUP_APV(c063a7e0,f6b22a8c,c7ccaa80,38,0) at VOP_LOOKUP_APV+0xa6
>>>lookup(f6b22b3c,0,c060880c,b5,c0511d45) at lookup+0x454
>>>namei(f6b22b3c,f6b22b8c,60,0,c7ccaa80) at namei+0x441
>>>kern_lstat(c7ccaa80,8059800,0,f6b22c10,2) at kern_lstat+0x5b
>>>lstat(c7ccaa80,f6b22d04,8,43c,c065c740) at lstat+0x2f
>>>syscall(805003b,807003b,bfbf003b,805f19c,bfbfeba0) at syscall+0x325
>>>Xint0x80_syscall() at Xint0x80_syscall+0x1f
>>>--- syscall (190, FreeBSD ELF32, lstat), eip = 0x28176efb, esp = 0xbfbfe90c, ebp = 0xbfbfea48 ---
>>>
>>
>>Since you can reproduce it, can you find out which test it is failing?
>>At the very least we need to add the test to fsck.
>>
>>Scott
>
>
> The main problem with dirbad panics is that the corruption occurred a
> long time ago, so a backtrace usually doesn't provide enough
> information to find out what went wrong.
>
> Doing a fsck _should_ fix the filesystem corruption, but only after
> the problem has already occurred. There are a few cases in which fsck
> needs to restart its current scan level or it can leave corruption
> inside the filesystem while marking the partition clean.
>
> -DR

Yes, I'm well aware of all of this; that's why I'm asking Pawel to
determine which test is failing so we can find out why fsck isn't
catching it.

Scott