From owner-freebsd-bugs@FreeBSD.ORG Thu May 6 17:10:25 2004
Delivered-To: freebsd-bugs@hub.freebsd.org
Date: Thu, 6 May 2004 17:10:19 -0700 (PDT)
Message-Id: <200405070010.i470AJdt021593@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Russell Francis
Subject: Re: kern/65801: 5.2.1 locks up with SMP kernel
Reply-To: Russell Francis
List-Id: Bug reports
X-List-Received-Date: Fri, 07 May 2004 00:10:25 -0000

The following reply was made to PR kern/65801; it has been noted by GNATS.

From: Russell Francis
To: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/65801: 5.2.1 locks up with SMP kernel
Date: Thu, 06 May 2004 12:55:54 -0500

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig345FFD9DA8F4463E73EA8B1B
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

> >Description:
> The machine which is a dual PIII "locks up" after a day or so when it
> is compiled with an SMP kernel. I have been unable to duplicate this
> with a UP kernel. When I say lock up, the following things occur.
>
> - Screen goes black.
>
> - Keyboard becomes unresponsive [Numlock/capslock] don't toggle lights
> on the keyboard.
>
> - The machine becomes unpingable and ssh into the machine no longer
> works.

The machine is probably either panicking, or deadlocking due to a locking
error. Enable WITNESS, DDB, INVARIANTS and kernel crashdumping, and try
to obtain debugging information. Leaving the machine out of X Windows
may also help, because you'll see the panic message on the system
console. See the Developer's Handbook and for more details.

Kris,

I haven't had any luck getting a core, nor has the machine locked up like
it had before. I have, however, been able to get it to drop to the kernel
debugger. Here is the stack trace if that helps. It looks like a
possible locking issue.

lock order reversal
 1st 0xc070a800 UMA lock (UMA lock) @ vm/uma_core.c:1200
 2nd 0xc0c31100 system map (system map) @ vm/vm_map.c:2210
Stack backtrace:
backtrace(c066e42c,c0c31100,c0679189,c0679189,c06791e4) at backtrace+0x17
witness_lock(c0c31100,8,c06791e4,8a2,c0c310a0) at witness_lock+0x5aa
_mtx_lock_flags(c0c31100,0,c06791db,8a2,c4a5f000) at _mtx_lock_flags+0x6a
_vm_map_lock(c0c310a0,c06791db,8a2,c070a0c0,1) at _vm_map_lock+0x36
vm_map_remove(c0c310a0,c4a5e000,c4a5f000,d767fbf8,c05faadb) at vm_map_remove+0x30
kmem_free(c0c310a0,c4a5e000,1000,d767fc28,c05fa4ef) at kmem_free+0x32
page_free(c4a5e000,1000,2,0,c4a5e000) at page_free+0x3b
zone_drain(c0c2b380,0,c067a289,4b0,0) at zone_drain+0x2cf
zone_foreach(c05fa220,d767fcf0,c05f7aa6,c067a191,0) at zone_foreach+0x45
uma_reclaim(c067a191,0,c067a13a,29e,c06e1bc0) at uma_reclaim+0x17
vm_pageout_scan(0,0,c067a13a,5a9,1f4) at vm_pageout_scan+0xf6
vm_pageout(0,d767fd48,c066a061,311,0) at vm_pageout+0x31b
fork_exit(c05f8840,0,d767fd48) at fork_exit+0x7e
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd767fd7c, ebp = 0 ---
Debugger("witness_lock")
Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
--------------------------------------------------------------------------

Looking at the output from dmesg also revealed this

lock order reversal
 1st 0xc070a800 UMA lock (UMA lock) @ vm/uma_core.c:1200
 2nd 0xc0c31100 system map (system map) @ vm/vm_map.c:2210
Stack backtrace:
psmintr: delay too long; resetting byte count
drm0: mem 0xd7000000-0xd77fffff,0xd6000000-0xd6003fff,0xd4000000-0xd5ffffff irq 9 at device 0.0 on pci1
info: [drm] AGP at 0xd0000000 64MB
info: [drm] Initialized mga 3.1.0 20021029 on minor 0
drm0: [MPSAFE]
lock order reversal
 1st 0xc4678108 vm object (vm object) @ vm/swap_pager.c:1323
 2nd 0xc0709c80 swap_pager swhash (swap_pager swhash) @ vm/swap_pager.c:1838
 3rd 0xc0c358c4 vm object (vm object) @ vm/uma_core.c:873
Stack backtrace:

I hope this helps a little, I am still trying to get a core ...

Thanks,
Russell Francis

--------------enig345FFD9DA8F4463E73EA8B1B
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (FreeBSD)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFAmnwzl8gE/LToDToRAgbaAJ4wOm2cK42rN5XrGnTvg9oufXkr1QCdFuTr
mR7/zFXm1r0I0oIkVHaH4Yk=
=RFZP
-----END PGP SIGNATURE-----

--------------enig345FFD9DA8F4463E73EA8B1B--
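
For readers unfamiliar with the "lock order reversal" reports quoted in this message, here is a rough sketch of the idea behind that kind of check. It is a toy model in Python, not FreeBSD's WITNESS code: the class `LockOrderChecker` and its methods are made up for illustration. The lock names are borrowed from the report, and the assumption that some earlier code path took "system map" before "UMA lock" is only a guess made so that the example has something to detect.

```python
class LockOrderChecker:
    """Toy lock-order checker (hypothetical; not the FreeBSD implementation)."""

    def __init__(self):
        self.order = set()     # (first, second) acquisition pairs seen so far
        self.held = []         # locks currently held, oldest first
        self.reversals = []    # (1st, 2nd) pairs that contradict an earlier order

    def acquire(self, name):
        for earlier in self.held:
            # If the opposite order was recorded before, this is a reversal.
            if (name, earlier) in self.order:
                self.reversals.append((earlier, name))
            self.order.add((earlier, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


checker = LockOrderChecker()

# Assumed earlier code path: system map first, then UMA lock.
checker.acquire("system map")
checker.acquire("UMA lock")
checker.release("UMA lock")
checker.release("system map")

# Path resembling the backtrace: UMA lock held, then system map taken.
checker.acquire("UMA lock")
checker.acquire("system map")

print(checker.reversals)  # [('UMA lock', 'system map')]
```

A reversal reported this way is a warning that two code paths disagree about ordering, which can deadlock if they ever run concurrently. It does not by itself prove that the lockups described in this PR come from this pair of locks.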