Date: Fri, 26 Aug 2011 17:51:48 -0400
From: Matthew Economou
To: freebsd-current@freebsd.org
Subject: "panic: mutex pf task mtx owned at /usr/src/sys/contrib/pf/net/if_pfsync.c:3163"

I recently upgraded a firewall I'm using for performance testing from a March-ish 9-CURRENT to 9.0-BETA1 (csup run August 21 around 12:00 AM EDT). It's basically a GENERIC kernel with debugging disabled and things like IPsec and ALTQ enabled.

Since the upgrade, roughly an hour after it boots, the firewall stops passing any traffic (IPv4 and IPv6). OpenVPN, for example, logs the following errors:

    write UDPv4: Operation not permitted (code=1)

Quagga, for another example, logs something similar:

    ripd[1696]: can't send packet : Operation not permitted0
    ospfd[1702]: *** sendmsg in ospf_write failed to 172.30.0.3, id 0, off 0, len 76, interface tap0 mtu 1500: Operation not permitted

If I try to ping something from the console, I get the same error message:

    # ping 4.2.2.2
    ping: sendto: Operation not permitted

It appears that PF isn't removing any entries from the state table. Note that the state table size is at its default of 10000 (which correlates to the amount of memory installed on the firewall - 256 MB).

    State Table                          Total             Rate
      current entries                    10013
      searches                          554801           13.4/s
      inserts                            10013            0.2/s
      removals                               0            0.0/s

I've tried both my current (unmodified and working prior to the upgrade) and experimental PF configurations, neither of which has any effect on the problem. Reloading the PF configuration (/etc/rc.d/pf reload) or restarting PF altogether (/etc/rc.d/pf restart) also has no effect. Only if I shut down PF completely (/etc/rc.d/pf stop) do I regain network connectivity - I can do things like ping hosts (IPv4 and IPv6), browse the web, and pass traffic that's just routed through the firewall (i.e., not requiring NAT). Clearing the state table (pfctl -F state) has no effect.

The kernel I was running had debugging disabled for performance testing purposes, so I booted a proper debug kernel. It panicked in pfsync_send_plus as soon as init enabled PF (backtrace included below).

    Starting pflog.
    pflog0: promiscuous mode enabled
    Aug 25 20:54:21 pflogd[1611]: [priv]: msg PRIV_OPEN_LOG received
    Enabling pfpanic: mutex pf task mtx owned at /usr/src/sys/contrib/pf/net/if_pfsync.c:3163
    cpuid = 0
    KDB: enter: panic
    [ thread pid 1619 tid 100053 ]
    Stopped at      kdb_enter+0x3a: movl    $0,kdb_why
    db> bt
    Tracing pid 1619 tid 100053 td 0xc23da2e0
    kdb_enter(c09777c9,c09777c9,c0975d7b,c6fd79e0,0,...) at kdb_enter+0x3a
    panic(c0975d7b,c0946080,c0944e87,c5b,c6fd7a0c,...) at panic+0x134
    _mtx_assert(c0a1b388,0,c0944e87,c5b,c6fd7a24,...) at _mtx_assert+0x127
    pfsync_send_plus(c6fd7a24,18,10,ad6,1000000,...) at pfsync_send_plus+0xf2
    pfsync_clear_states(a218d664,c236fb78,c0945f1c,635,c09ae167,...) at pfsync_clear_states+0x8d
    pfioctl(c22a0800,c0cc4412,c236fb00,3,c23da2e0,...) at pfioctl+0x1b90
    devfs_ioctl_f(c23ce578,c0cc4412,c236fb00,c216ce80,c23da2e0,...) at devfs_ioctl_f+0x10b
    kern_ioctl(c23da2e0,3,c0cc4412,c236fb00,1fd7cec,...) at kern_ioctl+0x21d
    ioctl(c23da2e0,c6fd7cec,c6fd7d28,c097d93a,0,...) at ioctl+0x134
    syscallenter(c23da2e0,c6fd7ce4,c6fd7ce4,0,0,...) at syscallenter+0x263
    syscall(c6fd7d28) at syscall+0x34
    Xint0x80_syscall() at Xint0x80_syscall+0x21
    --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x281e6263, esp = 0xbfbfe8ac, ebp = 0xbfbfe998 ---
    db>

I'm at a loss as to how to proceed. Is this a known problem with PF? Can anyone suggest a work-around?

Best wishes,
Matthew
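
P.S. For anyone who wants to watch for the same symptom, here is a rough sketch of how the state table counters quoted earlier could be checked automatically. It assumes the counters arrive as text in the layout shown in this message (the layout `pfctl -si` prints); the function names `parse_state_table` and `looks_stuck` are made up for illustration, and the "table full with zero removals" test is only a heuristic for the behaviour described here, not a PF diagnostic.

```python
import re

# Counter block copied from this report; the layout is assumed to match
# the "State Table" section printed by `pfctl -si`.
SAMPLE = """\
State Table                          Total             Rate
  current entries                    10013
  searches                          554801           13.4/s
  inserts                            10013            0.2/s
  removals                               0            0.0/s
"""

# A counter line is: a lowercase name, an integer total, an optional rate.
COUNTER_RE = re.compile(r"([a-z][a-z ]*?)\s+(\d+)(?:\s+\S+/s)?$")


def parse_state_table(text):
    """Return {counter name: total} for the State Table section of text."""
    counters = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("State Table"):
            in_table = True
            continue
        if not in_table:
            continue
        match = COUNTER_RE.match(stripped)
        if match is None:
            break  # first non-counter line ends the section
        counters[match.group(1)] = int(match.group(2))
    return counters


def looks_stuck(counters, state_limit):
    """Heuristic: table at or over its limit and nothing ever removed."""
    return (counters["current entries"] >= state_limit
            and counters["removals"] == 0)


if __name__ == "__main__":
    counters = parse_state_table(SAMPLE)
    print(counters)
    print(looks_stuck(counters, state_limit=10000))
```

The limit is passed in explicitly because the figure of 10000 is just the default mentioned above; a host with a different state limit would need its own value.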