From owner-freebsd-stable@FreeBSD.ORG Thu Mar 20 11:19:23 2008
Message-ID: <3dd203290803200419w3565bf3fpdf499ea96f11cb86@mail.gmail.com>
Date: Thu, 20 Mar 2008 11:19:22 +0000
From: "Brad Pitney"
To: "Kris Kennaway"
Cc: daichi@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: machine wedged -> KDB: enter: lock violation
In-Reply-To: <47E23A7F.3020807@FreeBSD.org>
References: <3dd203290803192039y2f905ae1m36833978a2799e29@mail.gmail.com> <47E23A7F.3020807@FreeBSD.org>
List-Id: Production branch of FreeBSD source code

On Thu, Mar 20, 2008 at 10:20 AM, Kris Kennaway wrote:
>
> Brad Pitney wrote:
> > Not sure why it keeps wedging, at first I thought it was something to
> > do with the LORs, now after adding some more debugging options I
> > think I might have found the answer!
> >
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c074b5ee,e70599ac,c05b6853,c4a9e000,e70599ac,...) at db_trace_self_wrapper+0x26
> > kdb_backtrace(c4a9e000,e70599ac,c07025c5,e70599bc,c4c44d98,...) at kdb_backtrace+0x29
> > vfs_badlock(c4a37900,e70599bc,c07b00a0,c4c44d98,c4a9e000) at vfs_badlock+0x23
> > assert_vop_elocked(c4c44d98,c0752ee7,c4a9e000,1b9,0,...) at assert_vop_elocked+0x53
> > cache_lookup(c4c4815c,e7059bc0,e7059bd4,e7059bc0,c4aa4400,...) at cache_lookup+0x53c
> > vfs_cache_lookup(e7059aa8,c07545ba,c4c4815c,2,c4c4815c,...) at vfs_cache_lookup+0xaa
> > VOP_LOOKUP_APV(c4a37900,e7059aa8,c4a9e000,c075356a,19b,...) at VOP_LOOKUP_APV+0xe5
> > lookup(e7059bac,e7059ae8,c6,bf,c4aa542c,...) at lookup+0x53e
> > namei(e7059bac,2,c0754d92,c0577808,c0811ae0,...) at namei+0x28e
> > kern_stat(c4a9e000,2820258c,0,e7059c1c,c074d152,...) at kern_stat+0x3d
> > stat(c4a9e000,e7059cfc,8,c074e1dc,c0785e00,...) at stat+0x2f
> > syscall(e7059d38) at syscall+0x273
> > Xint0x80_syscall() at Xint0x80_syscall+0x20
> > --- syscall (188, FreeBSD ELF32, stat), eip = 0x281aa48f, esp = 0xbfbfea4c, ebp = 0xbfbfeae8 ---
> > cache_lookup: 0xc4c44d98 is not exclusive locked but should be
> > KDB: enter: lock violation
> >
> > Locked vnodes
> > [...]
>
> Apparently 0xc4c44d98 is not locked at all, it didnt appear in your
> list. Are you sure that was all of it? What does 'show vnode
> 0xc4c44d98' report?
>

It's possible that wasn't all of it :( Is the 'show vnode' output the
only other information that might be needed if it happens again? That
is highly likely: I've had to reboot the box about 3 times a day on
average. Worst part is it never happens when I am logged in to the box,
grr.
My unionfs mount looks like this:

:/var/jail/nub01 on /var/jail/nub02 (unionfs, local, noatime)

I do have another problem with devfs, but it might be related to unionfs:

Starting jails: nub01
devfs ruleset: ioctl DEVFSIO_SUSE : Inappropriate ioctl for device
/etc/rc: WARNING: devfs_set_ruleset: unable to set ruleset 4 to /var/jail/nub02/dev
devfs rule: ioctl DEVFSIO_SAPPLY : Inappropriate ioctl for device
nub02 .

> This is likely to be a unionfs bug.

OK, I can remake the jail without using unionfs. Strange how it worked
with no problem before. Could the jails being out of sync cause the
problem? When I say out of sync, I mean they are still built from code
from September 2007, although from discussions on the mailing lists I
think you can get away with running RELENG_6 under jail without
problems.

>
> Kris
>

--
Best regards,
Brad