From owner-freebsd-current@FreeBSD.ORG Thu Aug 4 18:40:52 2005
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 76B2616A41F;
    Thu, 4 Aug 2005 18:40:52 +0000 (GMT)
    (envelope-from alc@cs.rice.edu)
Received: from cs.rice.edu (cs.rice.edu [128.42.1.30])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 25B3643D48;
    Thu, 4 Aug 2005 18:40:52 +0000 (GMT)
    (envelope-from alc@cs.rice.edu)
Received: from localhost (calypso.cs.rice.edu [128.42.1.127])
    by cs.rice.edu (Postfix) with ESMTP id 236984A9A5;
    Thu, 4 Aug 2005 13:40:51 -0500 (CDT)
Received: from cs.rice.edu ([128.42.1.30])
    by localhost (calypso.cs.rice.edu [128.42.1.127]) (amavisd-new, port 10024)
    with LMTP id 09552-01-91; Thu, 4 Aug 2005 13:40:50 -0500 (CDT)
Received: by cs.rice.edu (Postfix, from userid 19572)
    id 73C254A9A3; Thu, 4 Aug 2005 13:40:50 -0500 (CDT)
Date: Thu, 4 Aug 2005 13:40:50 -0500
From: Alan Cox
To: Robert Watson
Message-ID: <20050804184050.GA18131@cs.rice.edu>
References: <2C087707-319D-44BE-B770-89B7AF3CBD96@tamu.edu>
    <20050804111929.I23885@fledge.watson.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050804111929.I23885@fledge.watson.org>
User-Agent: Mutt/1.4.2i
X-Virus-Scanned: by amavis-2.2.1 at cs.rice.edu
Cc: alc@FreeBSD.org, "R. Tyler Ballance", freebsd-current@freebsd.org
Subject: Re: Kernel spewing errors/warnings on 7.0-CURRENT
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Thu, 04 Aug 2005 18:40:52 -0000

On Thu, Aug 04, 2005 at 11:21:33AM +0100, Robert Watson wrote:
> On Thu, 4 Aug 2005, R. Tyler Ballance wrote:
>
> >After restarting my FreeBSD-CURRENT workstation, shortly after I logged
> >in, and shortly before bg_fsck had completed (power outage last night)
> >I got the following set of messages spewed to ttyv0 :
>
> This is a non-fatal warning of a lock order issue between UMA and the VM
> system.  I.e., you don't need to worry per se, but we do need to fix the
> source of the problem.  I haven't had a chance to investigate what commit
> started the recent spate of these, but I'm guessing a new lock is either
> acquired in the vm_pageout code, or in the vm_map code, and UMA sits in
> the middle.  This is part of the cyclic dependency between UMA and VM.
> UMA is acquiring its lock so it can safely walk and drain the zone list
> without zones disappearing, etc.

Off the top of my head, I can't think of any recent locking changes that
could have this effect.  That aside, the lock order below is the one that
I would describe as correct, or at least expected.  Can you hardwire this
ordering into witness and see where it is being violated?  I hypothesize
that it is startup_alloc().

Regards,
Alan

> >
> >Aug 3 16:19:17 workstation kernel: lock order reversal
> >Aug 3 16:19:17 workstation kernel: 1st 0xc097ab20 UMA lock (UMA lock) @
> >/usr/src/sys/vm/uma_core.c:1491
> >Aug 3 16:19:17 workstation kernel: 2nd 0xc1060144 system map (system map)
> >@ /usr/src/sys/vm/vm_map.c:2317
> >Aug 3 16:19:17 workstation kernel: KDB: stack backtrace:
> >Aug 3 16:19:17 workstation kernel: kdb_backtrace
> >(0,ffffffff,c092fc38,c092fd78,c08ba624) at kdb_backtrace+0x29
> >Aug 3 16:19:17 workstation kernel: witness_checkorder
> >(c1060144,9,c0870d80,90d) at witness_checkorder+0x564
> >Aug 3 16:19:17 workstation kernel:
> >_mtx_lock_flags(c1060144,0,c0870d80,90d) at _mtx_lock_flags+0x5b
> >Aug 3 16:19:17 workstation kernel: _vm_map_lock(c10600c0,c0870d80,90d) at
> >_vm_map_lock+0x26
> >Aug 3 16:19:17 workstation kernel: vm_map_remove
> >(c10600c0,c1d0a000,c1d0b000,d56ecc08,c077e4e1) at vm_map_remove+0x1f
> >Aug 3 16:19:17 workstation kernel: kmem_free
> >(c10600c0,c1d0a000,1000,d56ecc38,c077de8e) at kmem_free+0x25
> >Aug 3 16:19:17 workstation kernel: page_free(c1d0a000,1000,2) at
> >page_free+0x29
> >Aug 3 16:19:17 workstation kernel: zone_drain(c104a960) at
> >zone_drain+0x26a
> >Aug 3 16:19:17 workstation kernel: zone_foreach
> >(c077dc24,d56eccec,c078fcaf,c1a3d900,d56ecc74) at zone_foreach+0x37
> >Aug 3 16:19:17 workstation kernel: uma_reclaim
> >(c1a3d900,d56ecc74,0,c0926260,d56ecc80) at uma_reclaim+0x12
> >Aug 3 16:19:17 workstation kernel: vm_pageout_scan
> >(0,c097af80,0,c087226d,5c3) at vm_pageout_scan+0x103
> >Aug 3 16:19:17 workstation kernel: vm_pageout(0,d56ecd38,0,c0790a68,0) at
> >vm_pageout+0x2c3
> >Aug 3 16:19:17 workstation kernel: fork_exit(c0790a68,0,d56ecd38) at
> >fork_exit+0xa0
> >Aug 3 16:19:17 workstation kernel: fork_trampoline() at
> >fork_trampoline+0x8
> >Aug 3 16:19:17 workstation kernel: --- trap 0x1, eip = 0, esp =
> >0xd56ecd6c, ebp = 0 ---
> >
> >
> >FreeBSD workstation.local 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Sat Jul 16
> >15:09:18 CDT 2005 root@workstation.local:/usr/obj/usr/src/sys/GENERIC
> >i386
> >
> >*****************************
> >
> >This didn't cause the kernel to panic or anything, I just figured the
> >kernel debugger output might be helpful to somebody here...
> >
> >The same lock order reversal happened the last time I did a clean reboot,
> >and this time after a hard restart.
> >
> >dmesg output is here if needed: http://agentdero.com/~tyler/
> >dmesg.workstation.txt
> >
> >Is this nothing to worry about? Or is something going slightly wrong?
> >
> >Cheers,
> >
> >-R. Tyler Ballance
> >
> >_______________________________________________
> >freebsd-current@freebsd.org mailing list
> >http://lists.freebsd.org/mailman/listinfo/freebsd-current
> >To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
> >
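
For readers following the suggestion above to hardwire the ordering into
witness, here is a toy model in Python of why that helps. It is only a
sketch, not the kernel's witness code: the class LockOrderChecker, the
function run(), and the "hypothetical_early_path" call site are all invented
for illustration. It assumes that a checker with no predeclared order treats
whichever order it sees first as correct, so it blames the later path (here
the zone_drain/vm_map_remove path from the backtrace). With the order
"UMA lock" before "system map" declared up front, the report lands on the
path that takes the locks the other way around.

```python
class LockOrderChecker:
    """Toy lock-order checker; NOT the FreeBSD witness implementation."""

    def __init__(self, hardwired_order=()):
        # known holds pairs (first, second): "first is acquired before second".
        self.known = set()
        order = list(hardwired_order)
        for i, first in enumerate(order):
            for second in order[i + 1:]:
                self.known.add((first, second))
        self.held = []      # lock names currently held, in acquisition order
        self.reports = []   # (held_lock, new_lock, where) for each reversal

    def acquire(self, name, where):
        for held_name in self.held:
            if (name, held_name) in self.known:
                # The established order puts `name` before `held_name`.
                self.reports.append((held_name, name, where))
            else:
                # First time this pair is seen: learn it as the valid order.
                self.known.add((held_name, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


def run(checker):
    # Hypothetical path (an assumption, not taken from the report) that
    # acquires the system map first and the UMA lock second.
    checker.acquire("system map", "hypothetical_early_path")
    checker.acquire("UMA lock", "hypothetical_early_path")
    checker.release("UMA lock")
    checker.release("system map")
    # Order shown in the reported backtrace: UMA lock, then system map.
    checker.acquire("UMA lock", "zone_drain")
    checker.acquire("system map", "vm_map_remove")
    checker.release("system map")
    checker.release("UMA lock")
    return checker.reports


learned = run(LockOrderChecker())
hardwired = run(LockOrderChecker(["UMA lock", "system map"]))
print("learned order blames:  ", learned)
print("hardwired order blames:", hardwired)
```

In the real kernel the predeclared orders live in the witness source rather
than in a constructor argument; the sketch only shows how the blame shifts
to the violating path once the expected order is fixed in advance.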