From owner-freebsd-net@FreeBSD.ORG Sun Apr 8 21:19:04 2007
Return-Path:
X-Original-To: freebsd-net@freebsd.org
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 3789916A401
	for ; Sun, 8 Apr 2007 21:19:04 +0000 (UTC)
	(envelope-from mgrooms@shrew.net)
Received: from shrew.net (206-223-169-85.beanfield.net [206.223.169.85])
	by mx1.freebsd.org (Postfix) with ESMTP id 0536D13C483
	for ; Sun, 8 Apr 2007 21:19:03 +0000 (UTC)
	(envelope-from mgrooms@shrew.net)
Received: from localhost (206-223-169-82.beanfield.net [206.223.169.82])
	by shrew.net (Postfix) with ESMTP id 3EDEA79E2D4;
	Sun, 8 Apr 2007 16:19:03 -0500 (CDT)
Received: from shrew.net ([206.223.169.85])
	by localhost (mx1.hub.org [206.223.169.82]) (amavisd-new, port 10024)
	with ESMTP id 72786-03; Sun, 8 Apr 2007 21:19:02 +0000 (UTC)
Received: from hole.shrew.net (24-155-108-213.dyn.grandenetworks.net [24.155.108.213])
	by shrew.net (Postfix) with ESMTP id B6D5979E2D3;
	Sun, 8 Apr 2007 16:19:01 -0500 (CDT)
Received: from [10.22.200.21] ([10.22.200.21])
	by hole.shrew.net (8.13.6/8.13.6) with ESMTP id l38EKpvf063540;
	Sun, 8 Apr 2007 14:20:51 GMT
	(envelope-from mgrooms@shrew.net)
Message-ID: <46195CAA.2050801@shrew.net>
Date: Sun, 08 Apr 2007 16:20:42 -0500
From: Matthew Grooms
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221)
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Content-Type: multipart/mixed; boundary="------------070209010107060903070601"
Subject: Bug in FAST IPSEC pfkey interaction ...
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 08 Apr 2007 21:19:04 -0000

This is a multi-part message in MIME format.
--------------070209010107060903070601
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

All,

I have encountered a bug while testing with the IKE daemon that is bundled with the client software I posted recently on this list. The daemon is multithreaded and tends to stress the pfkey interface a bit more than racoon does, because it submits batches of pfkey messages in rapid sequence when adding or removing client security policies. This triggers a page fault in netipsec/key.c key_spddelete2().

Offending code fragment ...

	/* Is there SP in SPD ? */
	if ((sp = key_getspbyid(id)) == NULL) {
		ipseclog((LOG_DEBUG, "%s: no SP found id:%u.\n", __func__, id));
		key_senderror(so, m, EINVAL);
	}

	sp->state = IPSEC_SPSTATE_DEAD;
	KEY_FREESP(&sp);

... where the function proceeds to set sp->state to IPSEC_SPSTATE_DEAD even if the sp pointer is NULL. My guess is that this was intended to read ...

	return key_senderror(so, m, EINVAL);

... or ...

	key_senderror(so, m, EINVAL);
	return 0;

... so that the function bails out before sp is dereferenced.

This could be a problem triggered by the daemon submitting multiple delete requests for a single SPD entry. I need to do a bit more digging to be certain. I just hope it's not an SPD locking issue, as the page fault only occurs about 1 out of every 10 attempts on my SMP system :/

I have attached some kgdb output and still have the core file lingering if anyone needs more info.
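
For illustration, the missing-return pattern can be modelled outside the kernel. What follows is only a minimal user-space sketch in C, not the netipsec code: struct policy, lookup_by_id(), send_error() and delete_policy() are hypothetical stand-ins for struct secpolicy, key_getspbyid(), key_senderror() and key_spddelete2(), and the rule that a lookup skips entries already marked dead is an assumption of the sketch rather than something confirmed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum sp_state { SP_ALIVE = 0, SP_DEAD = 1 };

struct policy {
	unsigned int	id;
	enum sp_state	state;
};

static struct policy table[] = {
	{ 1, SP_ALIVE },
	{ 2, SP_ALIVE },
};

/* Return the live policy with the given id, or NULL if there is none. */
struct policy *
lookup_by_id(unsigned int id)
{
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		/* Assumption: entries already marked dead are not returned. */
		if (table[i].state == SP_DEAD)
			continue;
		if (table[i].id == id)
			return (&table[i]);
	}
	return (NULL);
}

/* Report an error to the caller; here it is only logged and passed back. */
int
send_error(int code)
{
	fprintf(stderr, "delete failed: error %d\n", code);
	return (code);
}

/* Mark a policy dead, returning early when the lookup fails. */
int
delete_policy(unsigned int id)
{
	struct policy *sp;

	if ((sp = lookup_by_id(id)) == NULL) {
		/* The early return keeps sp from being dereferenced below. */
		return (send_error(EINVAL));
	}

	sp->state = SP_DEAD;
	return (0);
}

/* Small usage example covering a repeated delete and an unknown id. */
void
demo(void)
{
	assert(delete_policy(1) == 0);		/* first delete succeeds */
	assert(delete_policy(1) == EINVAL);	/* repeated delete is rejected */
	assert(delete_policy(99) == EINVAL);	/* unknown id is rejected */
}
```

In this model a repeated delete for the same id reaches the NULL branch, which matches the guess about multiple delete requests for one SPD entry. Without the return in that branch, the second and third calls in demo() would write through a null pointer.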
Thanks,

-Matthew

--------------070209010107060903070601
Content-Type: text/plain; name="crash.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="crash.txt"

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0xa0
fault code		= supervisor write, page not present
instruction pointer	= 0x8:0xffffffff803ba56c
stack pointer		= 0x10:0xffffffffa8163740
frame pointer		= 0x10:0x173
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 781 (iked)
trap number		= 12
panic: page fault
cpuid = 0
Uptime: 4m24s
Dumping 1014 MB (2 chunks)
  chunk 0: 1MB (156 pages) ... ok
  chunk 1: 1014MB (259472 pages) 998 982 966 950 934 918 902 886 870 854 838 822 806 790 774 758 742 726 710 694 678 662 646 630 614 598 582 566 550 534 518 502 486 470 454 438 422 406 390 374 358 342 326 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6

#0  doadump () at pcpu.h:172
172		__asm __volatile("movq %%gs:0,%0" : "=r" (td));

(kgdb) list *0xffffffff803ba56c
0xffffffff803ba56c is in key_spddelete2 (../../../netipsec/key.c:2160).
2155		if ((sp = key_getspbyid(id)) == NULL) {
2156			ipseclog((LOG_DEBUG, "%s: no SP found id:%u.\n", __func__, id));
2157			key_senderror(so, m, EINVAL);
2158		}
2159
2160		sp->state = IPSEC_SPSTATE_DEAD;
2161		KEY_FREESP(&sp);
2162
2163	    {
2164		struct mbuf *n, *nn;

(kgdb) backtrace
#0  doadump () at pcpu.h:172
#1  0x0000000000000004 in ?? ()
#2  0xffffffff802d9b07 in boot (howto=260)
    at ../../../kern/kern_shutdown.c:409
#3  0xffffffff802da1a1 in panic (fmt=0xffffff002a1b4980 "")
    at ../../../kern/kern_shutdown.c:565
#4  0xffffffff804c16df in trap_fatal (frame=0xffffff002a1b4980, eva=18446742974936698880)
    at ../../../amd64/amd64/trap.c:660
#5  0xffffffff804c19ff in trap_pfault (frame=0xffffffffa8163690, usermode=0)
    at ../../../amd64/amd64/trap.c:573
#6  0xffffffff804c1cb3 in trap (frame=
      {tf_rdi = -1474939040, tf_rsi = -1098805196416, tf_rdx = 1, tf_rcx = 0,
       tf_r8 = 2816, tf_r9 = 1, tf_rax = 0, tf_rbx = -1474938976, tf_rbp = 371,
       tf_r10 = 1, tf_r11 = 8, tf_r12 = -1098494254848, tf_r13 = -1098814477304,
       tf_r14 = -1474938976, tf_r15 = -1098814477304, tf_trapno = 12,
       tf_addr = 160, tf_flags = -1474938976, tf_err = 2, tf_rip = -2143574676,
       tf_cs = 8, tf_rflags = 66182, tf_rsp = -1474939056, tf_ss = 16})
    at ../../../amd64/amd64/trap.c:352
#7  0xffffffff804ace9b in calltrap () at ../../../amd64/amd64/exception.S:168
#8  0xffffffff803ba56c in key_spddelete2 (so=0xffffff00298dac08, m=0xffffff003ca3e100, mhp=0xffffffffa81637a0)
    at ../../../netipsec/key.c:2161
#9  0xffffffff803c1f24 in key_parse (m=0xffffff003ca3e100, so=0xffffff00298dac08)
    at ../../../netipsec/key.c:7303
#10 0xffffffff803c31db in key_output (m=0xffffff003ca3e100, so=0xffffff00298dac08)
    at ../../../netipsec/keysock.c:121
#11 0xffffffff8036f72a in raw_usend (so=0xffffffffa8163760, flags=706431360, m=0x1, nam=0x0, control=0x0, td=0x1)
    at ../../../net/raw_usrreq.c:263
#12 0xffffffff803c3ada in key_send (so=0xffffffffa8163760, flags=706431360, m=0x1, nam=0x0, control=0xb00, td=0x1)
    at ../../../netipsec/keysock.c:520
#13 0xffffffff8032001e in sosend (so=0xffffff00298dac08, addr=0x0, uio=0xffffffffa8163a90, top=0xffffff003ca3e100, control=0x0, flags=0, td=0xffffff002a1b4980)
    at ../../../kern/uipc_socket.c:836
#14 0xffffffff80327aa9 in kern_sendit (td=0xffffff002a1b4980, s=11, mp=0xffffffffa8163b50, flags=0, control=0x0, segflg=72)
    at ../../../kern/uipc_syscalls.c:772
#15 0xffffffff80328eb7 in sendit (td=0xffffff002a1b4980, s=11, mp=0xffffffffa8163b50, flags=0)
    at ../../../kern/uipc_syscalls.c:712
#16 0xffffffff80329234 in sendto (td=0xffffffffa8163760, uap=0xffffff002a1b4980)
    at ../../../kern/uipc_syscalls.c:830
#17 0xffffffff804c2531 in syscall (frame=
      {tf_rdi = 11, tf_rsi = 5918592, tf_rdx = 72, tf_rcx = 0, tf_r8 = 0,
       tf_r9 = 0, tf_rax = 133, tf_rbx = 34375701621, tf_rbp = 140737475749200,
       tf_r10 = 140737475744576, tf_r11 = 3, tf_r12 = 5957024, tf_r13 = 5738496,
       tf_r14 = 0, tf_r15 = 35, tf_trapno = 22, tf_addr = 0, tf_flags = 5859032,
       tf_err = 2, tf_rip = 34379888668, tf_cs = 43, tf_rflags = 582,
       tf_rsp = 140737475749160, tf_ss = 35})
    at ../../../amd64/amd64/trap.c:792
#18 0xffffffff804ad038 in Xfast_syscall () at ../../../amd64/amd64/exception.S:270
#19 0x000000080133781c in ?? ()
Previous frame inner to this frame (corrupt stack?)

--------------070209010107060903070601--