From owner-freebsd-stable@FreeBSD.ORG Tue Jun 7 23:30:56 2005
Return-Path:
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 63B3416A41C
    for ; Tue, 7 Jun 2005 23:30:56 +0000 (GMT)
    (envelope-from mgrooms@seton.org)
Received: from zixvpm01.seton.org (zixvpm01.seton.org [207.193.126.161])
    by mx1.FreeBSD.org (Postfix) with ESMTP id AAC0943D1D
    for ; Tue, 7 Jun 2005 23:30:55 +0000 (GMT)
    (envelope-from mgrooms@seton.org)
Received: from zixvpm01.seton.org (ZixVPM [127.0.0.1])
    by Outbound.seton.org (Proprietary) with ESMTP id BE54A3600A7
    for ; Tue, 7 Jun 2005 18:30:54 -0500 (CDT)
Received: from mx1-out.seton.org (unknown [10.21.254.249])
    by zixvpm01.seton.org (Proprietary) with ESMTP id 5E62F330057;
    Tue, 7 Jun 2005 18:30:54 -0500 (CDT)
Received: from localhost (unknown [127.0.0.1])
    by mx1-out.seton.org (Postfix) with ESMTP id 51D388014E25;
    Tue, 7 Jun 2005 18:30:54 -0500 (CDT)
Received: from mx1-out.seton.org ([10.21.254.249])
    by localhost (mx1 [127.0.0.1]) (amavisd-new, port 10026)
    with ESMTP id 12847-45; Tue, 7 Jun 2005 18:30:54 -0500 (CDT)
Received: from ausexfe01.seton.org (ausexfe01.seton.org [10.20.10.211])
    by mx1-out.seton.org (Postfix) with ESMTP id 383DF8014E24;
    Tue, 7 Jun 2005 18:30:54 -0500 (CDT)
Received: from [10.20.160.190] ([10.20.160.190])
    by ausexfe01.seton.org with Microsoft SMTPSVC(6.0.3790.211);
    Tue, 7 Jun 2005 18:30:54 -0500
Message-ID: <42A62F52.10705@seton.org>
Date: Tue, 07 Jun 2005 18:35:46 -0500
From: Matthew Grooms
Organization: Seton Healthcare Network
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Palle Girgensohn
References: <28FCC7CB4CF6EA43AF83BCA2096E97D013E555@AUSEX2VS1.seton.org>
    <20050606235703.GA1106@xor.obsecurity.org>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-OriginalArrivalTime: 07 Jun 2005 23:30:54.0071 (UTC) FILETIME=[F2BF6070:01C56BB8]
X-Virus-Scanned: by amavisd-new at seton.org
Cc: max@love2party.net, glebius@freebsd.org, freebsd-stable@freebsd.org,
    Kris Kennaway
Subject: Re: 5.4-RELEASE lockups on amd64 SMP
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 07 Jun 2005 23:30:56 -0000

Palle,

It's a Dell 2850 with dual CPUs, an AMR controller and 6x em devices
( 2x on board and 4x Intel Pro 1000 MT ). If you still want the full
dmesg output, reply and I will send it to you.

I suspect the issues I am seeing are related to an SMP locking
deficiency in pf/pfsync. I have a second identical system that has been
running off and on for a couple of weeks now as the pfsync peer, and it
hasn't hiccuped once. Its kernel is compiled without the SMP option.

Once again, here are the backtraces for the panic and LOR ...

Tracing pid 110 tid 100089 td 0xffffff012f3f0c80
kdb_enter() at kdb_enter+0x2f
panic() at panic+0x249
uma_dbg_free() at uma_dbg_free+0x188
uma_zfree_arg() at uma_zfree_arg+0x1b0
pf_purge_expired_states() at pf_purge_expired_states+0x41
pfsync_input() at pfsync_input+0xb35
pf_input() at ip_input+0x10f
netisr_processqueue() at netisr_processqueue+0x17
swi_net() at swi_net+0xa8
ithread_loop() at ithread_loop+0xd9
fork_exit() at fork_exit+0xc3
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffb44f9d00, rbp = 0 ---
db> continue
boot() called on cpu#0
Uptime: 13h42m43s
Dumping 4864 MB
 16 32 ...

lock order reversal
 1st 0xffffffff80752ec0 pf task mtx (pf task mtx) @ contrib/pf/net/if_pfsync.c:1621
 2nd 0xffffffff8076e9f0 user map (user map) @ vm/vm_map.c:2998
KDB: stack backtrace:
witness_checkorder() at witness_checkorder+0x654
_sx_xlock() at _sx_xlock+0x51
vm_map_lookup() at vm_map_lookup+0x44
vm_fault() at vm_fault+0xba
trap() at trap+0x1c5
alltraps_with_regs_pushed() at alltraps_with_regs_pushed+0x5
pf_state_tree_lan_ext_RB_REMOVE() at pf_state_tree_lan_ext_RB_REMOVE+0x10c
pf_purge_expired_states() at pf_purge_expired_states+0xab
pfsync_input() at ip_input+0x10f
netisr_processqueue() at netisr_processqueue+0x17
swi_net() at swi_net+0xa8
ithread_loop() at ithread_loop+0xd9
fork_exit() at fork_exit+0xc3
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffb44f9d00, rbp = 0 ---
KDB: enter: witness_checkorder
[thread pid 110 tid 100089]
Stopped at kdb_enter+0x2f: nop
db> panic blockable sleep lock (sleep mutex) tty @ kern/kern_event.c:1453
cpuid = 0
boot() called on cpu#0
Uptime: 10m40s
Dumping 4864 MB
 16 32 ...

-Matthew

Palle Girgensohn wrote:
> --On Monday, June 06, 2005 19.57.03 -0400 Kris Kennaway
> wrote:
>
>>
>> On Mon, Jun 06, 2005 at 06:54:05PM -0500, Grooms, Matthew wrote:
>>
>>> My apologies. With the debug options listed in my previous post (
>>> should have read 5.4 not 5.3 ), I got a lock order reversal. After a
>>> while, it panicked and spat out this ...
>>>
>
> Hi,
>
> Since I'm seeing panics with my Dell 2850 as soon as I add the second
> CPU (and I'm not alone, it seems), may I ask what brand this machine is?
> Can you send a dmesg? What are the ethernet devices?
>
> /Palle
>
>