From: Peter Losher
Date: Sat, 19 Feb 2005 02:38:02 -0800
To: stable@freebsd.org
Subject: Hard lockups using 5.3-RELEASE..
Message-ID: <4217170A.2030106@isc.org>
List-Id: Production branch of FreeBSD source code

We have a Celestica dual-Opteron system w/ 4GB RAM running 5.3-RELEASE/i386 (32-bit) with an SMP-aware kernel, and it is experiencing hard lockups. Debugging results below.
-=-
[BREAK]
KDB: enter: Line break on console
[thread 100104]
Stopped at      kdb_enter+0x2b: nop
db> where
kdb_enter(c084e4c6) at kdb_enter+0x2b
siointr1(c507d800,c0946700,0,c084e28e,6ad) at siointr1+0xce
siointr(c507d800) at siointr+0x21
intr_execute_handlers(c4f5d490,e9826b80,4,e9826bd0,c07b2ae3) at intr_execute_handlers+0x89
lapic_handle_intr(34) at lapic_handle_intr+0x2e
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc0604456, esp = 0xe9826bc4, ebp = 0xe9826bd0 ---
_mtx_lock_sleep(c08f67c0,c5698640,0,c084a0b3,126) at _mtx_lock_sleep+0xc6
_mtx_lock_flags(c08f67c0,0,c084a0b3,126,c6a82738) at _mtx_lock_flags+0x48
vm_fault(c5bbd5dc,81ae000,2,8,c5698640) at vm_fault+0x1fe
trap_pfault(e9826d48,1,81ae000,81ae000,0) at trap_pfault+0xf2
trap(2f,2f,2f,2000,81ae000) at trap+0x1df
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0x2809bd8d, esp = 0xbfbfb7b0, ebp = 0xbfbfb7e8 ---
db> panic
panic: from debugger
cpuid = 3
boot() called on cpu#3
Uptime: 2h50m29s
-=-

(Resetting the system then causes a panic, the system locks up for good, and a power reset is required.)

We were able to get a coredump, and the resulting kgdb output is below:

-=-
(kgdb) up
#45 0xc05f9bda in fork_exit (callout=0xc05fa5dc, arg=0xc4fe7a00, frame=0xe8daed48) at ../../../kern/kern_fork.c:811
811             callout(arg, frame);
(kgdb) l
806              * cpu_set_fork_handler intercepts this function call to
807              * have this call a non-return function to stay in kernel mode.
808              * initproc has its own fork handler, but it does return.
809              */
810             KASSERT(callout != NULL, ("NULL callout in fork_exit"));
811             callout(arg, frame);
812
813             /*
814              * Check if a kernel thread misbehaved and returned from its main
815              * function.
(kgdb) down
#44 0xc05fa6e8 in ithread_loop (arg=0xc4fe7a00) at ../../../kern/kern_intr.c:547
547                             ih->ih_handler(ih->ih_argument);
(kgdb) l
542                                     mtx_unlock(&ithd->it_lock);
543                                     goto restart;
544                             }
545                             if ((ih->ih_flags & IH_MPSAFE) == 0)
546                                     mtx_lock(&Giant);
547                             ih->ih_handler(ih->ih_argument);
548                             if ((ih->ih_flags & IH_MPSAFE) == 0)
549                                     mtx_unlock(&Giant);
550                     }
551                     if (ithd->it_enable != NULL) {
(kgdb) down
#43 0xc0615dfa in softclock (dummy=0x0) at ../../../kern/kern_timeout.c:247
247                                     mtx_lock(&Giant);
(kgdb) l
242                                         (c->c_flags & ~CALLOUT_PENDING);
243                             }
244                             curr_callout = c;
245                             mtx_unlock_spin(&callout_lock);
246                             if (!(c_flags & CALLOUT_MPSAFE)) {
247                                     mtx_lock(&Giant);
248                                     gcalls++;
249                                     CTR1(KTR_CALLOUT, "callout %p", c_func);
250                             } else {
251                                     mpcalls++;
-=-

It looks like it's trying to lock Giant while it already has Giant. In any case, we have rebuilt a uniprocessor kernel for now. If this is already fixed in 5-STABLE, then let me know. ;)

Best Wishes - Peter
--
Peter_Losher@isc.org | ISC | OpenPGP 0xE8048D08 | "The bits must flow"