From owner-freebsd-amd64@FreeBSD.ORG Thu May 15 11:40:01 2014
Return-Path:
Delivered-To: freebsd-amd64@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 38A1053F
	for ; Thu, 15 May 2014 11:40:01 +0000 (UTC)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id 19B372E10
	for ; Thu, 15 May 2014 11:40:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.8/8.14.8) with ESMTP id s4FBe09Z065595
	for ; Thu, 15 May 2014 11:40:00 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s4FBe0xh065594;
	Thu, 15 May 2014 11:40:00 GMT
	(envelope-from gnats)
Date: Thu, 15 May 2014 11:40:00 GMT
Message-Id: <201405151140.s4FBe0xh065594@freefall.freebsd.org>
To: freebsd-amd64@FreeBSD.org
Cc:
From: John Baldwin
Subject: Re: amd64/189741: 9/STABLE panic at em_msix_rx w/ em(4) + PF
Reply-To: John Baldwin
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 15 May 2014 11:40:01 -0000

The following reply was made to PR amd64/189741; it has been noted by GNATS.

From: John Baldwin
To: Nick Rogers, freebsd-gnats-submit@FreeBSD.org
Cc: Gleb Smirnoff
Subject: Re: amd64/189741: 9/STABLE panic at em_msix_rx w/ em(4) + PF
Date: Thu, 15 May 2014 07:32:53 -0400

On 5/12/14, 7:43 PM, Nick Rogers wrote:
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address = 0x10
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff8033d350
> stack pointer = 0x28:0xffffff83545384b0
> frame pointer = 0x28:0xffffff83545384c0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 12 (irq262: em2:rx 0)
> trap number = 12
> panic: page fault
> cpuid = 5
> KDB: stack backtrace:
> #0 0xffffffff80956836 at kdb_backtrace+0x66
> #1 0xffffffff8091c40e at panic+0x1ce
> #2 0xffffffff80d31e70 at trap_fatal+0x290
> #3 0xffffffff80d321d1 at trap_pfault+0x211
> #4 0xffffffff80d327d3 at trap+0x363
> #5 0xffffffff80d1b9d3 at calltrap+0x8
> #6 0xffffffff8034872d at pf_test_rule+0x17ed
> #7 0xffffffff8034ba12 at pf_test+0x1032
> #8 0xffffffff8035112b at pf_check_in+0x2b
> #9 0xffffffff809e952e at pfil_run_hooks+0x9e
> #10 0xffffffff80a5286a at ip_input+0x2ea
> #11 0xffffffff809e8858 at netisr_dispatch_src+0x218
> #12 0xffffffff809df93d at ether_demux+0x14d
> #13 0xffffffff809dfc1e at ether_nh_input+0x1fe
> #14 0xffffffff809e8858 at netisr_dispatch_src+0x218
> #15 0xffffffff809df85f at ether_demux+0x6f
> #16 0xffffffff809dfc1e at ether_nh_input+0x1fe
> #17 0xffffffff809e8858 at netisr_dispatch_src+0x218
> Uptime: 17d7h20m59s
> Dumping 2932 out of 12256 MB: (CTRL-C to abort)
> .1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> Reading symbols from /boot/kernel/aio.ko...Reading symbols from
> /boot/kernel/aio.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/aio.ko
> Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
> /boot/kernel/coretemp.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/coretemp.ko
> Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from
> /boot/kernel/cc_htcp.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/cc_htcp.ko
> #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:234
> 234 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) list *0xffffffff8033d350
> 0xffffffff8033d350 is in pf_addrcpy (/usr/src/sys/contrib/pf/net/pf.c:512).
> 507 pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af)
> 508 {
> 509 switch (af) {
> 510 #ifdef INET
> 511 case AF_INET:
> 512 dst->addr32[0] = src->addr32[0];
> 513 break;
> 514 #endif /* INET */
> 515 case AF_INET6:
> 516 dst->addr32[0] = src->addr32[0];
> (kgdb) backtrace
> #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:234
> #1 0xffffffff8091bee6 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:454
> #2 0xffffffff8091c3e7 in panic (fmt=0x1)
> at /usr/src/sys/kern/kern_shutdown.c:642
> #3 0xffffffff80d31e70 in trap_fatal (frame=0xc, eva=Variable "eva" is
> not available.
> ) at /usr/src/sys/amd64/amd64/trap.c:878
> #4 0xffffffff80d321d1 in trap_pfault (frame=0xffffff8354538400,
> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:794
> #5 0xffffffff80d327d3 in trap (frame=0xffffff8354538400) at
> /usr/src/sys/amd64/amd64/trap.c:456
> #6 0xffffffff80d1b9d3 in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:232
> #7 0xffffffff8033d350 in pf_addrcpy (dst=0xfffffe010c6416b8,
> src=0x10, af=2 '\002') at /usr/src/sys/contrib/pf/net/pf.c:522

A 'src' pointer of 0x10 here would explain the crash (and is consistent
with the fault address).

> #8 0xffffffff8034872d in pf_test_rule (rm=0xffffff8354538788,
> sm=0xffffff8354538780, direction=1, kif=0xfffffe0007d08100,
> m=0xfffffe0030555d00, off=20, h=0xfffffe0030bad00e,
> pd=0xffffff83545386c0, am=0xffffff8354538790,
> rsm=0xffffff8354538778, ifq=0x0, inp=0x0) at
> /usr/src/sys/contrib/pf/net/pf.c:3900

This is actually in pf_create_state(), and it would seem that 'nk' would
have to be NULL for this to happen. However, 'nsn' would have to be
non-NULL. I think I see a possible bug that is fixed in 10. Try this:

Index: 9/sys/contrib/pf/net/pf_lb.c
===================================================================
--- 9/sys/contrib/pf/net/pf_lb.c	(revision 266119)
+++ 9/sys/contrib/pf/net/pf_lb.c	(working copy)
@@ -788,6 +788,7 @@
 			pool_put(&pf_state_key_pl, *skp);
 #endif
 			*skw = *sks = *nkp = *skp = NULL;
+			*sn = NULL;
 			return (NULL);
 		}
 	}

-- 
John Baldwin
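
For illustration only, here is a minimal C sketch of the failure mode the
patch above appears to address: a function fills several out-parameters,
bails out on an error path, and resets only some of them. The types and the
functions translate() and would_deref_null() are hypothetical stand-ins, not
the real pf(4) code, and the clear_sn flag exists only to contrast the
behavior with and without a reset like '*sn = NULL'.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real pf(4) structures. */
struct src_node { int id; };
struct state_key { int port; };

/*
 * Hypothetical translation step with two out-parameters. On the failure
 * path it has to reset every out-parameter it may have set earlier.
 * clear_sn selects the behavior: 0 leaves *sn stale, 1 resets it.
 */
int
translate(int fail, int clear_sn, struct state_key **nkp, struct src_node **sn)
{
	static struct state_key key = { 80 };
	static struct src_node node = { 1 };

	*nkp = &key;
	*sn = &node;			/* set before the step that can fail */
	if (fail) {
		*nkp = NULL;
		if (clear_sn)
			*sn = NULL;	/* the kind of reset the patch adds */
		return (-1);
	}
	return (0);
}

/*
 * Stand-in for a caller that uses 'nk' whenever 'nsn' is non-NULL.
 * Returns 1 if that caller would end up dereferencing a NULL 'nk'.
 */
int
would_deref_null(int fail, int clear_sn)
{
	struct state_key *nk = NULL;
	struct src_node *nsn = NULL;

	(void)translate(fail, clear_sn, &nk, &nsn);
	return (nsn != NULL && nk == NULL);
}
```

With the stale pointer left in place, would_deref_null(1, 0) reports the
NULL 'nk' plus non-NULL 'nsn' combination described above; with the reset,
would_deref_null(1, 1) does not.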