From owner-freebsd-amd64@FreeBSD.ORG Sat May 17 11:50:02 2014
Return-Path:
Delivered-To: freebsd-amd64@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id 817E53F1
    for ; Sat, 17 May 2014 11:50:02 +0000 (UTC)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id 629962958
    for ; Sat, 17 May 2014 11:50:02 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.8/8.14.8) with ESMTP id s4HBo2Y8007785
    for ; Sat, 17 May 2014 11:50:02 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s4HBo27Z007784;
    Sat, 17 May 2014 11:50:02 GMT
    (envelope-from gnats)
Date: Sat, 17 May 2014 11:50:02 GMT
Message-Id: <201405171150.s4HBo27Z007784@freefall.freebsd.org>
To: freebsd-amd64@FreeBSD.org
Cc:
From: John Baldwin
Subject: Re: amd64/189741: 9/STABLE panic at em_msix_rx w/ em(4) + PF
Reply-To: John Baldwin
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 17 May 2014 11:50:02 -0000

The following reply was made to PR amd64/189741; it has been noted by GNATS.

From: John Baldwin
To: Nick Rogers
Cc: freebsd-gnats-submit@freebsd.org, Gleb Smirnoff
Subject: Re: amd64/189741: 9/STABLE panic at em_msix_rx w/ em(4) + PF
Date: Sat, 17 May 2014 07:43:26 -0400

On 5/16/14, 10:51 AM, Nick Rogers wrote:
> On Thu, May 15, 2014 at 4:32 AM, John Baldwin wrote:
>> On 5/12/14, 7:43 PM, Nick Rogers wrote:
>>> GNU gdb 6.1.1 [FreeBSD]
>>> Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public License, and you are
>>> welcome to change it and/or distribute copies of it under certain conditions.
>>> Type "show copying" to see the conditions.
>>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>>> This GDB was configured as "amd64-marcel-freebsd"...
>>>
>>> Unread portion of the kernel message buffer:
>>>
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> cpuid = 5; apic id = 05
>>> fault virtual address = 0x10
>>> fault code = supervisor read data, page not present
>>> instruction pointer = 0x20:0xffffffff8033d350
>>> stack pointer = 0x28:0xffffff83545384b0
>>> frame pointer = 0x28:0xffffff83545384c0
>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>> processor eflags = interrupt enabled, resume, IOPL = 0
>>> current process = 12 (irq262: em2:rx 0)
>>> trap number = 12
>>> panic: page fault
>>> cpuid = 5
>>> KDB: stack backtrace:
>>> #0 0xffffffff80956836 at kdb_backtrace+0x66
>>> #1 0xffffffff8091c40e at panic+0x1ce
>>> #2 0xffffffff80d31e70 at trap_fatal+0x290
>>> #3 0xffffffff80d321d1 at trap_pfault+0x211
>>> #4 0xffffffff80d327d3 at trap+0x363
>>> #5 0xffffffff80d1b9d3 at calltrap+0x8
>>> #6 0xffffffff8034872d at pf_test_rule+0x17ed
>>> #7 0xffffffff8034ba12 at pf_test+0x1032
>>> #8 0xffffffff8035112b at pf_check_in+0x2b
>>> #9 0xffffffff809e952e at pfil_run_hooks+0x9e
>>> #10 0xffffffff80a5286a at ip_input+0x2ea
>>> #11 0xffffffff809e8858 at netisr_dispatch_src+0x218
>>> #12 0xffffffff809df93d at ether_demux+0x14d
>>> #13 0xffffffff809dfc1e at ether_nh_input+0x1fe
>>> #14 0xffffffff809e8858 at netisr_dispatch_src+0x218
>>> #15 0xffffffff809df85f at ether_demux+0x6f
>>> #16 0xffffffff809dfc1e at ether_nh_input+0x1fe
>>> #17 0xffffffff809e8858 at netisr_dispatch_src+0x218
>>> Uptime: 17d7h20m59s
>>> Dumping 2932 out of 12256 MB: (CTRL-C to abort)
>>> .1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>>
>>> Reading symbols from /boot/kernel/aio.ko...Reading symbols from
>>> /boot/kernel/aio.ko.symbols...done.
>>> done.
>>> Loaded symbols for /boot/kernel/aio.ko
>>> Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
>>> /boot/kernel/coretemp.ko.symbols...done.
>>> done.
>>> Loaded symbols for /boot/kernel/coretemp.ko
>>> Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from
>>> /boot/kernel/cc_htcp.ko.symbols...done.
>>> done.
>>> Loaded symbols for /boot/kernel/cc_htcp.ko
>>> #0 doadump (textdump=Variable "textdump" is not available.
>>> ) at pcpu.h:234
>>> 234 pcpu.h: No such file or directory.
>>> in pcpu.h
>>> (kgdb) list *0xffffffff8033d350
>>> 0xffffffff8033d350 is in pf_addrcpy (/usr/src/sys/contrib/pf/net/pf.c:512).
>>> 507 pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af)
>>> 508 {
>>> 509 switch (af) {
>>> 510 #ifdef INET
>>> 511 case AF_INET:
>>> 512 dst->addr32[0] = src->addr32[0];
>>> 513 break;
>>> 514 #endif /* INET */
>>> 515 case AF_INET6:
>>> 516 dst->addr32[0] = src->addr32[0];
>>> (kgdb) backtrace
>>> #0 doadump (textdump=Variable "textdump" is not available.
>>> ) at pcpu.h:234
>>> #1 0xffffffff8091bee6 in kern_reboot (howto=260) at
>>> /usr/src/sys/kern/kern_shutdown.c:454
>>> #2 0xffffffff8091c3e7 in panic (fmt=0x1 )
>>> at /usr/src/sys/kern/kern_shutdown.c:642
>>> #3 0xffffffff80d31e70 in trap_fatal (frame=0xc, eva=Variable "eva" is
>>> not available.
>>> ) at /usr/src/sys/amd64/amd64/trap.c:878
>>> #4 0xffffffff80d321d1 in trap_pfault (frame=0xffffff8354538400,
>>> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:794
>>> #5 0xffffffff80d327d3 in trap (frame=0xffffff8354538400) at
>>> /usr/src/sys/amd64/amd64/trap.c:456
>>> #6 0xffffffff80d1b9d3 in calltrap () at
>>> /usr/src/sys/amd64/amd64/exception.S:232
>>> #7 0xffffffff8033d350 in pf_addrcpy (dst=0xfffffe010c6416b8,
>>> src=0x10, af=2 '\002') at /usr/src/sys/contrib/pf/net/pf.c:522
>>
>> A 'src' pointer of 0x10 here would explain the crash (and is consistent
>> with the fault address).
>>
>>> #8 0xffffffff8034872d in pf_test_rule (rm=0xffffff8354538788,
>>> sm=0xffffff8354538780, direction=1, kif=0xfffffe0007d08100,
>>> m=0xfffffe0030555d00, off=20, h=0xfffffe0030bad00e,
>>> pd=0xffffff83545386c0, am=0xffffff8354538790,
>>> rsm=0xffffff8354538778, ifq=0x0, inp=0x0) at
>>> /usr/src/sys/contrib/pf/net/pf.c:3900
>>
>> This is actually in pf_create_state(), and it would seem that 'nk' would
>> have to be NULL for this to happen. However, 'nsn' would have
>> to be non-NULL.
>>
>> I think I see a possible bug that is fixed in 10. Try this:
>>
>> Index: 9/sys/contrib/pf/net/pf_lb.c
>> ===================================================================
>> --- 9/sys/contrib/pf/net/pf_lb.c (revision 266119)
>> +++ 9/sys/contrib/pf/net/pf_lb.c (working copy)
>> @@ -788,6 +788,7 @@
>> pool_put(&pf_state_key_pl, *skp);
>> #endif
>> *skw = *sks = *nkp = *skp = NULL;
>> + *sn = NULL;
>> return (NULL);
>> }
>> }
>>
> Thank you! I will give that a shot and let you know if the panic continues.

I just checked and this was the fix made to HEAD in r260377 for PR 182557.
It just needs to be merged. I'll try to get to that today.

--
John Baldwin
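
The failure mode discussed above can be illustrated with a small, self-contained C sketch. It is not the pf code: the `toy_*` structures, their layout, and the `fix` flag are all hypothetical and chosen only for illustration. It shows two things under those assumptions: a read through a NULL structure pointer faults at the member's offset (here deliberately laid out as 0x10), and a failure path that clears the key pointers but leaves the source-node out-parameter stale gives the caller exactly the "'nsn' non-NULL, 'nk' NULL" combination described in the reply.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real pf structures. */
struct toy_addr {
	uint32_t addr32[4];
};

struct toy_key {
	uint64_t pad[2];	/* 16 bytes, chosen so 'addr' sits at offset 0x10 */
	struct toy_addr addr;
};

struct toy_src_node {
	int id;
};

/*
 * Address that a read of nk->addr would touch when nk == NULL:
 * simply the offset of the member within the structure.
 */
size_t
toy_fault_offset(void)
{
	return (offsetof(struct toy_key, addr));
}

/*
 * Toy translation step that always fails.  The source node is handed
 * out before the failure is detected.  With fix == 0 the failure path
 * clears only the key pointer (the assumed pre-patch behaviour); with
 * fix != 0 it also clears *sn, mirroring the one-line patch.
 */
int
toy_translate(struct toy_key **nkp, struct toy_src_node **sn, int fix)
{
	static struct toy_src_node node = { 1 };

	*sn = &node;		/* set early, before the failure */
	*nkp = NULL;		/* failure: no key could be built */
	if (fix)
		*sn = NULL;
	return (-1);
}

/*
 * Returns 1 when the caller would take the "have a source node" branch
 * while holding a NULL key, i.e. when it would dereference NULL.
 */
int
toy_would_crash(int fix)
{
	struct toy_key *nk = NULL;
	struct toy_src_node *nsn = NULL;

	(void)toy_translate(&nk, &nsn, fix);
	return (nsn != NULL && nk == NULL);
}
```

In this sketch `toy_would_crash(0)` reports the dangerous state and `toy_would_crash(1)` does not, which is the effect the `*sn = NULL;` line is intended to have on the failure path.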