From owner-freebsd-bugs@FreeBSD.ORG Thu Oct 6 21:10:14 2005 Return-Path: X-Original-To: freebsd-bugs@hub.freebsd.org Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1C98816A41F for ; Thu, 6 Oct 2005 21:10:14 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8733443D48 for ; Thu, 6 Oct 2005 21:10:13 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id j96LADol032388 for ; Thu, 6 Oct 2005 21:10:13 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.3/8.13.1/Submit) id j96LADsa032387; Thu, 6 Oct 2005 21:10:13 GMT (envelope-from gnats) Resent-Date: Thu, 6 Oct 2005 21:10:13 GMT Resent-Message-Id: <200510062110.j96LADsa032387@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-bugs@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Mark Gooderum Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7136F16A41F for ; Thu, 6 Oct 2005 21:01:11 +0000 (GMT) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [216.136.204.117]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3482343D46 for ; Thu, 6 Oct 2005 21:01:11 +0000 (GMT) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j96L19Tg092181 for ; Thu, 6 Oct 2005 21:01:09 GMT (envelope-from nobody@www.freebsd.org) Received: (from nobody@localhost) by www.freebsd.org (8.13.1/8.13.1/Submit) id j96L185Y092180; Thu, 6 Oct 2005 21:01:08 GMT (envelope-from nobody) Message-Id: <200510062101.j96L185Y092180@www.freebsd.org> Date: Thu, 6 Oct 2005 21:01:08 GMT From: Mark Gooderum To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-2.3 Cc: Subject: kern/87014: BPF_MTAP/bpf_mtap are not threadsafe and cause panics on SMP systems X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Oct 2005 21:10:14 -0000 >Number: 87014 >Category: kern >Synopsis: BPF_MTAP/bpf_mtap are not threadsafe and cause panics on SMP systems >Confidential: no >Severity: serious >Priority: low >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Oct 06 21:10:12 GMT 2005 >Closed-Date: >Last-Modified: >Originator: Mark Gooderum >Release: 5.3-RELEASE >Organization: Vernier Networks, Inc. >Environment: FreeBSD 139.94.1.20 5.3-RELEASE FreeBSD 5.3-RELEASE #0: Thu Aug 4 09:03:53 PDT 2005 build@build-amd3.verniernetworks.com:/usr/build/ambit2/freebsd5/sys/i386/compile/VNISMP i386 >Description: BPF_MTAP/BPF_MTAP2 do a non-atomic test and invoke based on the value of the if_bpf field. The problem is that if the last bpf of on the interface is deleted, the if_bpf field is set to NULL by bpf_detachd(). If another thread (such as a userland process) deletes the last bpf in the (admittedly small) window between the test and invocation then bpf_mtap() is invoked with a NULL bp parameter which cases a fault on the LIST_EMPTY check #define BPF_MTAP(_ifp,_m) do { \ if ((_ifp)->if_bpf) { \ M_ASSERTVALID(_m); \ bpf_mtap((_ifp)->if_bpf, (_m)); \ } \ } while (0) This happens becase on i386 at least the initial NULL check does _NOT_ fetch the variable: Line 3359 of "../../../dev/bge/if_bge.c" starts at address 0xc045e7a5 and ends at 0xc045e7c0 . 0xc045e7a5 : cmpl $0x0,0x2074(%edi,%eax,4) 0xc045e7ad : jne 0xc045ea71 0xc045e7b3 : lea 0xfc(%esi),%eax 0xc045e7b9 : mov %eax,0xffffffec(%ebp) 0xc045e7bc : lea 0x0(%esi),%es It just does an optimized test for NULLness, the value isn't fetched until later when setting up the call: 0xc045ea46 : mov %ebx,0x4(%esp) 0xc045ea4a : mov 0x3c(%esi),%eax 0xc045ea4d : mov %eax,(%esp) 0xc045ea50 : call 0xc05897d0 0xc045ea55 : lea 0x0(%esi),%esi 0xc045ea59 : lea 0x0(%edi),%edi 0xc045ea60 : mov 0xfffffff0(%ebp),%eax 0xc045ea63 : cmpl $0x0,0x2074(%edi,%eax,4) 0xc045ea6b : je 0xc045e7c0 So the window is small but real. Our field experience is with a box doing a large amount of packet processing while running frequent nessus scans (nessus adds and removes BPF filters on the fly as needed for certain tests). >How-To-Repeat: Have lots of bpf filter add/deletes happening on a system under a heavy packet load. I will attach a simple test program that adds/deletes filters on an interface at a high rate if desired. This window also affects BPF_MTAP2/bpf_mtap2() >Fix: Either modify bpf_mtap()/bpf_mtap2() to check for a NULL parameter (the test/modify race doesn't apply once we are into the function because the bpf_if lasts as long as the interface, only the pointer to it in the struct ifnet comes and goes and by the time we're in bpf_mtap() we're looking a copy of the variable on the stack, not the actual ifnet field. Most interfaces have per-interface locks that could also be used but those mutexes are currently private to the drivers. --- /tmp/tmp.97907.0 Thu Oct 6 15:57:52 2005 +++ sys/net/bpf.c Thu Oct 6 15:57:45 2005 @@ -1201,20 +1201,27 @@ */ void bpf_mtap(bp, m) struct bpf_if *bp; struct mbuf *m; { struct bpf_d *d; u_int pktlen, slen; /* + * We can sometimes be invoked w/NULL bp due to a small race in + * BPF_MTAP(), see PR#xxxxx. + */ + if (!bp) + return; + + /* * Lockless read to avoid cost of locking the interface if there are * no descriptors attached. */ if (LIST_EMPTY(&bp->bif_dlist)) return; pktlen = m_length(m, NULL); if (pktlen == m->m_len) { bpf_tap(bp, mtod(m, u_char *), pktlen); return; @@ -1245,20 +1252,27 @@ void bpf_mtap2(bp, data, dlen, m) struct bpf_if *bp; void *data; u_int dlen; struct mbuf *m; { struct mbuf mb; struct bpf_d *d; u_int pktlen, slen; + + /* + * We can sometimes be invoked w/NULL bp due to a small race in + * BPF_MTAP2(), see PR#xxxxx. + */ + if (!bp) + return; /* * Lockless read to avoid cost of locking the interface if there are * no descriptors attached. */ if (LIST_EMPTY(&bp->bif_dlist)) return; pktlen = m_length(m, NULL); /* >Release-Note: >Audit-Trail: >Unformatted: