From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To: freebsd-net@freebsd.org
Cc: Gleb Smirnoff
Subject: [PATCH] BPF locking redesign
Date: Mon, 26 Mar 2012 14:59:39 +0400
Message-ID: <4F704C1B.5080709@FreeBSD.org>

Hello list!

There are some patches that can significantly improve forwarding performance when BPF consumers are present. However, some of the changes are a bit hackish and an ABI change is required, so they are split into separate patches I want to discuss. You will probably need to merge r233505 for the patches to apply.

Description: bpf_rwlock

This one is simple and straightforward: we convert the interface and descriptor locks to rwlock(9). Additionally, the per-descriptor (filter) reader lock acquisition in bpf_mtap[2] is removed, as suggested by glebius@. Instead, the filter is protected by taking the interface writer lock on every filter change. This greatly improves performance: in the most common case we acquire one reader lock instead of two mutexes. The internals of the bpf_if structure are now hidden behind the BPF_INTERNAL define, which permits including bpf.h without pulling in the rwlock headers. However, this is a temporary solution; struct bpf_if should be made opaque to any external caller.

Description: bpf_writers

Linux and Solaris (at least OpenSolaris) have the PF_PACKET socket family for sending raw Ethernet frames. The only FreeBSD interface that can be used to send raw frames is BPF. As a result, many programs such as cdpd, lldpd and various DHCP tools use BPF only to send data. This leads to the situation where software like cdpd, when run on a high-traffic-volume interface, significantly reduces overall performance, since we have to acquire additional locks for every packet. Here we add a sysctl that changes BPF behaviour in the following way: if a program opens a BPF descriptor without explicitly specifying a read filter, we assume it to be write-only and add it to a special writer-only per-interface list. bpf_peers_present() then keeps returning 0, so no additional overhead is introduced. Once a filter is supplied, the descriptor is moved to the original per-interface list, permitting packets to be captured. Unfortunately, pcap_open_live() sets a catch-all filter itself for the purpose of setting the snap length. Fortunately, most programs explicitly set a filter (even a catch-all one) after that; tcpdump(1) is a good example. So a somewhat hackish approach is taken: we upgrade the descriptor only after the second BIOCSETF is received. The sysctl is named net.bpf.optimize_writers and is turned off by default.
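To make the target use case concrete, here is a minimal, illustrative sketch (not part of the attached patches) of the kind of write-only consumer the bpf_writers change is aimed at: it opens /dev/bpf, binds it to an interface with BIOCSETIF and writes one raw frame, but never installs a read filter. With net.bpf.optimize_writers=1 such a descriptor would simply stay on the writer-only list, so bpf_peers_present() keeps returning 0 on the hot path. The interface name and frame contents are placeholders:

#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#include <net/if.h>
#include <net/bpf.h>

#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct ifreq ifr;
	unsigned char frame[64] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,	/* destination: broadcast */
		0x00, 0x00, 0x00, 0x00, 0x00, 0x01,	/* source: dummy, kernel fills it in */
		0x88, 0xb5				/* IEEE local experimental ethertype */
	};
	int fd;

	/* /dev/bpf clones a new descriptor on every open(2). */
	if ((fd = open("/dev/bpf", O_RDWR)) == -1)
		err(1, "open(/dev/bpf)");

	/* Bind the descriptor to an interface; "em0" is just a placeholder. */
	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name));
	if (ioctl(fd, BIOCSETIF, &ifr) == -1)
		err(1, "BIOCSETIF");

	/*
	 * Send one raw frame.  No BIOCSETF is ever issued, so with
	 * net.bpf.optimize_writers=1 this descriptor never leaves the
	 * writer-only list and adds no per-packet capture overhead.
	 */
	if (write(fd, frame, sizeof(frame)) == -1)
		err(1, "write");

	close(fd);
	return (0);
}

A consumer that does issue BIOCSETF twice (e.g. pcap_open_live() followed by pcap_setfilter()) gets upgraded to the active reader list as described above.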
Description: bpf_if_opaque

No patch for this one yet. We could probably do the following: create the bpf_if structure in bpfattach2(), but do not assign it to the supplied pointer. Instead, put the new structure pointer and the interface pointer into some kind of hash (keyed by interface name). When a _reader_ arrives AND sets a valid filter, we check whether the bpf_if needs to be attached to the interface. When the reader disconnects, we set the bpf_if pointer back to NULL. No additional locking is required here: the same struct bpf_if lives as long as the interface exists, so at any point in time the pointer is either NULL or a pointer to that structure. Even if some thread on another CPU sees a stale value, there is no problem: NULL means no filter is set, so we skip BPF entirely; a non-NULL value sends us into BPF_MTAP, which acquires the read lock and determines that no peers are present. As a result, bpf_peers_present(x) simply returns (x), and there is no need to expose struct bpf_if to external users. Additionally, we do not require the interface write lock to be acquired on every interface attach/departure, which can otherwise lead to a small number of packets being dropped on mpd servers. Btw, we could consider switching the rlock/wlock to the _try_ variants to avoid this behaviour. The major drawback of this approach is that it completely breaks the ABI: any pre-compiled network driver carries an inlined bpf_peers_present() which assumes ifp->if_bpf is non-NULL. However, we can introduce bpfattach3(), or some special interface flag (set by if_alloc(), for example), indicating that the driver uses the new API, and retain the original behaviour for old drivers.

-- 
WBR, Alexander

--------------060905060202080703000205
Content-Type: text/plain; name="bpf_writers.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="bpf_writers.diff"

>From 2ca09cf74ef63fbde0ccede4a7e883c6f0add51d Mon Sep 17 00:00:00 2001
From: "Alexander V. Chernikov"
Date: Mon, 26 Mar 2012 14:58:13 +0400
Subject: [PATCH 2/2] Optimize BPF writers

---
 share/man/man4/bpf.4 |   31 +++++++++++++---
 sys/net/bpf.c        |   97 ++++++++++++++++++++++++++++++++++++++++++++++----
 sys/net/bpf.h        |    1 +
 sys/net/bpfdesc.h    |    1 +
 4 files changed, 120 insertions(+), 10 deletions(-)

diff --git a/share/man/man4/bpf.4 b/share/man/man4/bpf.4
index b9d6742..44bed39 100644
--- a/share/man/man4/bpf.4
+++ b/share/man/man4/bpf.4
@@ -952,10 +952,33 @@ array initializers:
 .Fn BPF_STMT opcode operand
 and
 .Fn BPF_JUMP opcode operand true_offset false_offset .
-.Sh FILES
-.Bl -tag -compact -width /dev/bpf
-.It Pa /dev/bpf
-the packet filter device
+.Sh SYSCTL VARIABLES
+A set of
+.Xr sysctl 8
+variables controls the behaviour of the
+.Nm
+subsystem.
+.Bl -tag -width indent
+.It Va net.bpf.optimize_writers: No 0
+Various programs use BPF to send (but not receive) raw packets
+(cdpd, lldpd, dhcpd and dhcp relays are good examples of such programs).
+They do not need incoming packets to be sent to them. Turning this option on
+causes new BPF consumers to be attached to a write-only interface list until
+the program explicitly specifies a read filter via
+.Cm pcap_setfilter() .
+This removes any performance degradation for high-speed interfaces.
+.It Va net.bpf.stats:
+Binary interface for retrieving general statistics.
+.It Va net.bpf.zerocopy_enable: No 0
+Permits zero-copy to be used with net BPF readers. Use with caution.
+.It Va net.bpf.maxinsns: No 512
+Maximum number of instructions that a BPF program can contain. Use the
+.Xr tcpdump 1
+-d option to determine the approximate number of instructions for any filter.
+.It Va net.bpf.maxbufsize: No 524288
+Maximum buffer size to allocate for packet buffers.
+.It Va net.bpf.bufsize: No 4096
+Default buffer size to allocate for packet buffers.
 .El
 .Sh EXAMPLES
 The following filter is taken from the Reverse ARP Daemon.
diff --git a/sys/net/bpf.c b/sys/net/bpf.c
index c1d4da7..a3a64ad 100644
--- a/sys/net/bpf.c
+++ b/sys/net/bpf.c
@@ -176,6 +176,12 @@ SYSCTL_INT(_net_bpf, OID_AUTO, zerocopy_enable, CTLFLAG_RW,
 static SYSCTL_NODE(_net_bpf, OID_AUTO, stats, CTLFLAG_MPSAFE | CTLFLAG_RW,
     bpf_stats_sysctl, "bpf statistics portal");
 
+static VNET_DEFINE(int, bpf_optimize_writers) = 0;
+#define	V_bpf_optimize_writers VNET(bpf_optimize_writers)
+SYSCTL_VNET_INT(_net_bpf, OID_AUTO, optimize_writers,
+    CTLFLAG_RW, &VNET_NAME(bpf_optimize_writers), 0,
+    "Do not send packets until BPF program is set");
+
 static	d_open_t	bpfopen;
 static	d_read_t	bpfread;
 static	d_write_t	bpfwrite;
@@ -572,17 +578,66 @@ static void
 bpf_attachd(struct bpf_d *d, struct bpf_if *bp)
 {
 	/*
-	 * Point d at bp, and add d to the interface's list of listeners.
-	 * Finally, point the driver's bpf cookie at the interface so
-	 * it will divert packets to bpf.
+	 * Point d at bp, and add d to the interface's list.
+	 * Since there are many applications using BPF for
+	 * sending raw packets only (dhcpd, cdpd are good examples)
+	 * we can delay adding d to the list of active listeners until
+	 * some filter is configured.
 	 */
-	BPFIF_WLOCK(bp);
 	d->bd_bif = bp;
-	LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next);
+	BPFIF_WLOCK(bp);
+
+	if (V_bpf_optimize_writers != 0) {
+		/* Add to writers-only list */
+		LIST_INSERT_HEAD(&bp->bif_wlist, d, bd_next);
+		/*
+		 * We decrement bd_writer on every filter set operation.
+		 * The first BIOCSETF is done by pcap_open_live() to set up
+		 * the snap length. After that the application usually sets its own filter.
+		 */
+		d->bd_writer = 2;
+	} else
+		LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next);
+
+	BPFIF_WUNLOCK(bp);
+
+	BPF_LOCK();
 	bpf_bpfd_cnt++;
+	BPF_UNLOCK();
+
+	CTR3(KTR_NET, "%s: bpf_attach called by pid %d, adding to %s list",
+	    __func__, d->bd_pid, d->bd_writer ? "writer" : "active");
+
+	if (V_bpf_optimize_writers == 0)
+		EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1);
+}
+
+/*
+ * Add d to the list of active bp filters.
+ * Requires bpf_attachd() to be called first.
+ */
+static void
+bpf_upgraded(struct bpf_d *d)
+{
+	struct bpf_if *bp;
+
+	bp = d->bd_bif;
+
+	BPFIF_WLOCK(bp);
+	BPFD_WLOCK(d);
+
+	/* Remove from writers-only list */
+	LIST_REMOVE(d, bd_next);
+	LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next);
+	/* Mark d as reader */
+	d->bd_writer = 0;
+
+	BPFD_WUNLOCK(d);
 	BPFIF_WUNLOCK(bp);
 
+	CTR2(KTR_NET, "%s: upgrade required by pid %d", __func__, d->bd_pid);
+
 	EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1);
 }
 
@@ -596,12 +651,17 @@ bpf_detachd(struct bpf_d *d)
 	struct bpf_if *bp;
 	struct ifnet *ifp;
 
+	CTR2(KTR_NET, "%s: detach required by pid %d", __func__, d->bd_pid);
+
 	BPF_LOCK_ASSERT();
 
 	bp = d->bd_bif;
 
 	BPFIF_WLOCK(bp);
 	BPFD_WLOCK(d);
 
+	/* Save bd_writer value */
+	error = d->bd_writer;
+
 	/*
 	 * Remove d from the interface's descriptor list.
 	 */
 	LIST_REMOVE(d, bd_next);
@@ -615,7 +675,9 @@ bpf_detachd(struct bpf_d *d)
 	/* We're already protected by global lock. */
 	bpf_bpfd_cnt--;
 
-	EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0);
+	/* Call event handler iff d is attached */
+	if (error == 0)
+		EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0);
 
 	/*
 	 * Check if this descriptor had requested promiscuous mode.
@@ -1536,6 +1598,7 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd)
 #ifdef COMPAT_FREEBSD32
 	struct bpf_program32 *fp32;
 	struct bpf_program fp_swab;
+	int need_upgrade = 0;
 
 	if (cmd == BIOCSETWF32 || cmd == BIOCSETF32 || cmd == BIOCSETFNR32) {
 		fp32 = (struct bpf_program32 *)fp;
@@ -1611,6 +1674,16 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd)
 #endif
 			if (cmd == BIOCSETF)
 				reset_d(d);
+
+			/*
+			 * Do not require upgrade on the first BIOCSETF
+			 * (used by pcap_open_live() to set the snap length).
+			 */
+			if ((d->bd_writer != 0) && (--d->bd_writer == 0))
+				need_upgrade = 1;
+			CTR4(KTR_NET, "%s: filter function set by pid %d, "
+			    "bd_writer counter %d, need_upgrade %d",
+			    __func__, d->bd_pid, d->bd_writer, need_upgrade);
 		}
 		BPFD_WUNLOCK(d);
 		BPFIF_WUNLOCK(d->bd_bif);
@@ -1621,6 +1694,10 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd)
 				bpf_destroy_jit_filter(ofunc);
 #endif
 
+			/* Move d to active readers list */
+			if (need_upgrade != 0)
+				bpf_upgraded(d);
+
 			return (0);
 		}
 		free((caddr_t)fcode, M_BPF);
@@ -2265,6 +2342,7 @@ bpfattach2(struct ifnet *ifp, u_int dlt, u_int hdrlen, struct bpf_if **driverp)
 		panic("bpfattach");
 
 	LIST_INIT(&bp->bif_dlist);
+	LIST_INIT(&bp->bif_wlist);
 	bp->bif_ifp = ifp;
 	bp->bif_dlt = dlt;
 	rw_init(&bp->bif_lock, "bpf interface lock");
@@ -2520,6 +2598,13 @@ bpf_stats_sysctl(SYSCTL_HANDLER_ARGS)
 	index = 0;
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 		BPFIF_RLOCK(bp);
+		/* Send writers-only first */
+		LIST_FOREACH(bd, &bp->bif_wlist, bd_next) {
+			xbd = &xbdbuf[index++];
+			BPFD_RLOCK(bd);
+			bpfstats_fill_xbpf(xbd, bd);
+			BPFD_RUNLOCK(bd);
+		}
 		LIST_FOREACH(bd, &bp->bif_dlist, bd_next) {
 			xbd = &xbdbuf[index++];
 			BPFD_RLOCK(bd);
diff --git a/sys/net/bpf.h b/sys/net/bpf.h
index 4c4f0c3..01e1bd2 100644
--- a/sys/net/bpf.h
+++ b/sys/net/bpf.h
@@ -1104,6 +1104,7 @@ struct bpf_if {
 	u_int bif_hdrlen;		/* length of link header */
 	struct ifnet *bif_ifp;		/* corresponding interface */
 	struct rwlock bif_lock;		/* interface lock */
+	LIST_HEAD(, bpf_d) bif_wlist;	/* writer-only list */
 #endif
 };
diff --git a/sys/net/bpfdesc.h b/sys/net/bpfdesc.h
index 9ea4522..e11fdc6 100644
--- a/sys/net/bpfdesc.h
+++ b/sys/net/bpfdesc.h
@@ -79,6 +79,7 @@ struct bpf_d {
 	u_char bd_promisc;	/* true if listening promiscuously */
 	u_char bd_state;	/* idle, waiting, or timed out */
 	u_char bd_immediate;	/* true to return on packet arrival */
+	u_char bd_writer;	/* non-zero if d is writer-only */
 	int bd_hdrcmplt;	/* false to fill in src lladdr automatically */
 	int bd_direction;	/* select packet direction */
 	int bd_tstamp;		/* select time stamping function */
-- 
1.7.9.4

--------------060905060202080703000205
Content-Type: text/plain; name="bpf_rwlock.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="bpf_rwlock.diff"

>From 00a39222ce5721e2fc3925657c3b1e69502d59b5 Mon Sep 17 00:00:00 2001
From: "Alexander V.
Chernikov" Date: Mon, 26 Mar 2012 14:57:09 +0400 Subject: [PATCH 1/2] Change mutexes to rwlocks --- sys/kern/subr_witness.c | 4 +- sys/net/bpf.c | 251 +++++++++++++++++++++++++------------------- sys/net/bpf.h | 7 +- sys/net/bpf_buffer.c | 6 +- sys/net/bpf_zerocopy.c | 10 +- sys/net/bpfdesc.h | 23 ++-- sys/security/mac/mac_net.c | 2 + 7 files changed, 180 insertions(+), 123 deletions(-) diff --git a/sys/kern/subr_witness.c b/sys/kern/subr_witness.c index 7853e69..53529b3 100644 --- a/sys/kern/subr_witness.c +++ b/sys/kern/subr_witness.c @@ -563,8 +563,8 @@ static struct witness_order_list_entry order_lists[] = { * BPF */ { "bpf global lock", &lock_class_mtx_sleep }, - { "bpf interface lock", &lock_class_mtx_sleep }, - { "bpf cdev lock", &lock_class_mtx_sleep }, + { "bpf interface lock", &lock_class_rw }, + { "bpf cdev lock", &lock_class_rw }, { NULL, NULL }, /* * NFS server diff --git a/sys/net/bpf.c b/sys/net/bpf.c index cf8f2ec..c1d4da7 100644 --- a/sys/net/bpf.c +++ b/sys/net/bpf.c @@ -43,6 +43,8 @@ __FBSDID("$FreeBSD: head/sys/net/bpf.c 232449 2012-03-03 08:19:18Z jmallett $"); #include #include +#include +#include #include #include #include @@ -66,6 +68,7 @@ __FBSDID("$FreeBSD: head/sys/net/bpf.c 232449 2012-03-03 08:19:18Z jmallett $"); #include #include +#define BPF_INTERNAL #include #include #ifdef BPF_JITTER @@ -207,7 +210,7 @@ bpf_append_bytes(struct bpf_d *d, caddr_t buf, u_int offset, void *src, u_int len) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); switch (d->bd_bufmode) { case BPF_BUFMODE_BUFFER: @@ -227,7 +230,7 @@ bpf_append_mbuf(struct bpf_d *d, caddr_t buf, u_int offset, void *src, u_int len) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); switch (d->bd_bufmode) { case BPF_BUFMODE_BUFFER: @@ -249,7 +252,7 @@ static void bpf_buf_reclaimed(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); switch (d->bd_bufmode) { case BPF_BUFMODE_BUFFER: @@ -290,7 +293,6 @@ bpf_canfreebuf(struct bpf_d *d) static int bpf_canwritebuf(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); switch (d->bd_bufmode) { @@ -309,7 +311,7 @@ static void bpf_buffull(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); switch (d->bd_bufmode) { case BPF_BUFMODE_ZBUF: @@ -325,7 +327,7 @@ void bpf_bufheld(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); switch (d->bd_bufmode) { case BPF_BUFMODE_ZBUF: @@ -574,12 +576,12 @@ bpf_attachd(struct bpf_d *d, struct bpf_if *bp) * Finally, point the driver's bpf cookie at the interface so * it will divert packets to bpf. */ - BPFIF_LOCK(bp); + BPFIF_WLOCK(bp); d->bd_bif = bp; LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next); bpf_bpfd_cnt++; - BPFIF_UNLOCK(bp); + BPFIF_WUNLOCK(bp); EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1); } @@ -594,20 +596,24 @@ bpf_detachd(struct bpf_d *d) struct bpf_if *bp; struct ifnet *ifp; + BPF_LOCK_ASSERT(); + bp = d->bd_bif; - BPFIF_LOCK(bp); - BPFD_LOCK(d); - ifp = d->bd_bif->bif_ifp; + BPFIF_WLOCK(bp); + BPFD_WLOCK(d); /* * Remove d from the interface's descriptor list. */ LIST_REMOVE(d, bd_next); - bpf_bpfd_cnt--; + ifp = bp->bif_ifp; d->bd_bif = NULL; - BPFD_UNLOCK(d); - BPFIF_UNLOCK(bp); + BPFD_WUNLOCK(d); + BPFIF_WUNLOCK(bp); + + /* We're already protected by global lock. 
*/ + bpf_bpfd_cnt--; EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0); @@ -642,16 +648,16 @@ bpf_dtor(void *data) { struct bpf_d *d = data; - BPFD_LOCK(d); + BPFD_WLOCK(d); if (d->bd_state == BPF_WAITING) callout_stop(&d->bd_callout); d->bd_state = BPF_IDLE; - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); funsetown(&d->bd_sigio); - mtx_lock(&bpf_mtx); + BPF_LOCK(); if (d->bd_bif) bpf_detachd(d); - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); #ifdef MAC mac_bpfdesc_destroy(d); #endif /* MAC */ @@ -689,14 +695,14 @@ bpfopen(struct cdev *dev, int flags, int fmt, struct thread *td) d->bd_bufmode = BPF_BUFMODE_BUFFER; d->bd_sig = SIGIO; d->bd_direction = BPF_D_INOUT; - d->bd_pid = td->td_proc->p_pid; + BPF_PID_REFRESH(d, td); #ifdef MAC mac_bpfdesc_init(d); mac_bpfdesc_create(td->td_ucred, d); #endif - mtx_init(&d->bd_mtx, devtoname(dev), "bpf cdev lock", MTX_DEF); - callout_init_mtx(&d->bd_callout, &d->bd_mtx, 0); - knlist_init_mtx(&d->bd_sel.si_note, &d->bd_mtx); + rw_init(&d->bd_lock, "bpf cdev lock"); + callout_init_rw(&d->bd_callout, &d->bd_lock, 0); + knlist_init_rw_reader(&d->bd_sel.si_note, &d->bd_lock); return (0); } @@ -725,10 +731,10 @@ bpfread(struct cdev *dev, struct uio *uio, int ioflag) non_block = ((ioflag & O_NONBLOCK) != 0); - BPFD_LOCK(d); - d->bd_pid = curthread->td_proc->p_pid; + BPFD_WLOCK(d); + BPF_PID_REFRESH_CUR(d); if (d->bd_bufmode != BPF_BUFMODE_BUFFER) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (EOPNOTSUPP); } if (d->bd_state == BPF_WAITING) @@ -764,18 +770,18 @@ bpfread(struct cdev *dev, struct uio *uio, int ioflag) * it before using it again. */ if (d->bd_bif == NULL) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (ENXIO); } if (non_block) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (EWOULDBLOCK); } - error = msleep(d, &d->bd_mtx, PRINET|PCATCH, + error = rw_sleep(d, &d->bd_lock, PRINET|PCATCH, "bpf", d->bd_rtout); if (error == EINTR || error == ERESTART) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (error); } if (error == EWOULDBLOCK) { @@ -793,7 +799,7 @@ bpfread(struct cdev *dev, struct uio *uio, int ioflag) break; if (d->bd_slen == 0) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (0); } ROTATE_BUFFERS(d); @@ -803,7 +809,7 @@ bpfread(struct cdev *dev, struct uio *uio, int ioflag) /* * At this point, we know we have something in the hold slot. */ - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); /* * Move data from hold buffer into user space. 
@@ -816,12 +822,12 @@ bpfread(struct cdev *dev, struct uio *uio, int ioflag) */ error = bpf_uiomove(d, d->bd_hbuf, d->bd_hlen, uio); - BPFD_LOCK(d); + BPFD_WLOCK(d); d->bd_fbuf = d->bd_hbuf; d->bd_hbuf = NULL; d->bd_hlen = 0; bpf_buf_reclaimed(d); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (error); } @@ -833,7 +839,7 @@ static __inline void bpf_wakeup(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); if (d->bd_state == BPF_WAITING) { callout_stop(&d->bd_callout); d->bd_state = BPF_IDLE; @@ -851,7 +857,7 @@ bpf_timed_out(void *arg) { struct bpf_d *d = (struct bpf_d *)arg; - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); if (callout_pending(&d->bd_callout) || !callout_active(&d->bd_callout)) return; @@ -866,7 +872,7 @@ static int bpf_ready(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); if (!bpf_canfreebuf(d) && d->bd_hlen != 0) return (1); @@ -889,7 +895,7 @@ bpfwrite(struct cdev *dev, struct uio *uio, int ioflag) if (error != 0) return (error); - d->bd_pid = curthread->td_proc->p_pid; + BPF_PID_REFRESH_CUR(d); d->bd_wcount++; if (d->bd_bif == NULL) { d->bd_wdcount++; @@ -937,11 +943,11 @@ bpfwrite(struct cdev *dev, struct uio *uio, int ioflag) CURVNET_SET(ifp->if_vnet); #ifdef MAC - BPFD_LOCK(d); + BPFD_WLOCK(d); mac_bpfdesc_create_mbuf(d, m); if (mc != NULL) mac_bpfdesc_create_mbuf(d, mc); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); #endif error = (*ifp->if_output)(ifp, m, &dst, NULL); @@ -970,7 +976,7 @@ static void reset_d(struct bpf_d *d) { - mtx_assert(&d->bd_mtx, MA_OWNED); + BPFD_WLOCK_ASSERT(d); if ((d->bd_hbuf != NULL) && (d->bd_bufmode != BPF_BUFMODE_ZBUF || bpf_canfreebuf(d))) { @@ -1037,12 +1043,12 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, /* * Refresh PID associated with this descriptor. */ - BPFD_LOCK(d); - d->bd_pid = td->td_proc->p_pid; + BPFD_WLOCK(d); + BPF_PID_REFRESH(d, td); if (d->bd_state == BPF_WAITING) callout_stop(&d->bd_callout); d->bd_state = BPF_IDLE; - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); if (d->bd_locked == 1) { switch (cmd) { @@ -1108,11 +1114,11 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, { int n; - BPFD_LOCK(d); + BPFD_WLOCK(d); n = d->bd_slen; if (d->bd_hbuf) n += d->bd_hlen; - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); *(int *)addr = n; break; @@ -1163,9 +1169,9 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, * Flush read packet buffer. */ case BIOCFLUSH: - BPFD_LOCK(d); + BPFD_WLOCK(d); reset_d(d); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); break; /* @@ -1488,15 +1494,15 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, return (EINVAL); } - BPFD_LOCK(d); + BPFD_WLOCK(d); if (d->bd_sbuf != NULL || d->bd_hbuf != NULL || d->bd_fbuf != NULL || d->bd_bif != NULL) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); CURVNET_RESTORE(); return (EBUSY); } d->bd_bufmode = *(u_int *)addr; - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); break; case BIOCGETZMAX: @@ -1556,7 +1562,12 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd) if (fp->bf_insns == NULL) { if (fp->bf_len != 0) return (EINVAL); - BPFD_LOCK(d); + /* + * Protect filter change by interface lock, too. + * The same lock order is used by bpf_detachd(). 
+ */ + BPFIF_WLOCK(d->bd_bif); + BPFD_WLOCK(d); if (wfilter) d->bd_wfilter = NULL; else { @@ -1567,7 +1578,8 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd) if (cmd == BIOCSETF) reset_d(d); } - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); + BPFIF_WUNLOCK(d->bd_bif); if (old != NULL) free((caddr_t)old, M_BPF); #ifdef BPF_JITTER @@ -1584,7 +1596,12 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd) fcode = (struct bpf_insn *)malloc(size, M_BPF, M_WAITOK); if (copyin((caddr_t)fp->bf_insns, (caddr_t)fcode, size) == 0 && bpf_validate(fcode, (int)flen)) { - BPFD_LOCK(d); + /* + * Protect filter change by interface lock, too + * The same lock order is used by bpf_detachd(). + */ + BPFIF_WLOCK(d->bd_bif); + BPFD_WLOCK(d); if (wfilter) d->bd_wfilter = fcode; else { @@ -1595,7 +1612,8 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd) if (cmd == BIOCSETF) reset_d(d); } - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); + BPFIF_WUNLOCK(d->bd_bif); if (old != NULL) free((caddr_t)old, M_BPF); #ifdef BPF_JITTER @@ -1659,9 +1677,9 @@ bpf_setif(struct bpf_d *d, struct ifreq *ifr) bpf_attachd(d, bp); } - BPFD_LOCK(d); + BPFD_WLOCK(d); reset_d(d); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (0); } @@ -1685,8 +1703,8 @@ bpfpoll(struct cdev *dev, int events, struct thread *td) * Refresh PID associated with this descriptor. */ revents = events & (POLLOUT | POLLWRNORM); - BPFD_LOCK(d); - d->bd_pid = td->td_proc->p_pid; + BPFD_WLOCK(d); + BPF_PID_REFRESH(d, td); if (events & (POLLIN | POLLRDNORM)) { if (bpf_ready(d)) revents |= events & (POLLIN | POLLRDNORM); @@ -1700,7 +1718,7 @@ bpfpoll(struct cdev *dev, int events, struct thread *td) } } } - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (revents); } @@ -1720,12 +1738,12 @@ bpfkqfilter(struct cdev *dev, struct knote *kn) /* * Refresh PID associated with this descriptor. */ - BPFD_LOCK(d); - d->bd_pid = curthread->td_proc->p_pid; + BPFD_WLOCK(d); + BPF_PID_REFRESH_CUR(d); kn->kn_fop = &bpfread_filtops; kn->kn_hook = d; knlist_add(&d->bd_sel.si_note, kn, 1); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (0); } @@ -1744,7 +1762,7 @@ filt_bpfread(struct knote *kn, long hint) struct bpf_d *d = (struct bpf_d *)kn->kn_hook; int ready; - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); ready = bpf_ready(d); if (ready) { kn->kn_data = d->bd_slen; @@ -1819,9 +1837,19 @@ bpf_tap(struct bpf_if *bp, u_char *pkt, u_int pktlen) int gottime; gottime = BPF_TSTAMP_NONE; - BPFIF_LOCK(bp); + + BPFIF_RLOCK(bp); + LIST_FOREACH(d, &bp->bif_dlist, bd_next) { - BPFD_LOCK(d); + /* + * We are not using any locks for d here because: + * 1) any filter change is protected by interface + * write lock + * 2) destroying/detaching d is protected by interface + * write lock, too + */ + + /* XXX: Do not protect counter for the sake of performance. */ ++d->bd_rcount; /* * NB: We dont call BPF_CHECK_DIRECTION() here since there is no @@ -1837,6 +1865,11 @@ bpf_tap(struct bpf_if *bp, u_char *pkt, u_int pktlen) #endif slen = bpf_filter(d->bd_rfilter, pkt, pktlen, pktlen); if (slen != 0) { + /* + * Filter matches. Let's to acquire write lock. 
+ */ + BPFD_WLOCK(d); + d->bd_fcount++; if (gottime < bpf_ts_quality(d->bd_tstamp)) gottime = bpf_gettime(&bt, d->bd_tstamp, NULL); @@ -1845,10 +1878,10 @@ bpf_tap(struct bpf_if *bp, u_char *pkt, u_int pktlen) #endif catchpacket(d, pkt, pktlen, slen, bpf_append_bytes, &bt); + BPFD_WUNLOCK(d); } - BPFD_UNLOCK(d); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } #define BPF_CHECK_DIRECTION(d, r, i) \ @@ -1857,6 +1890,7 @@ bpf_tap(struct bpf_if *bp, u_char *pkt, u_int pktlen) /* * Incoming linkage from device drivers, when packet is in an mbuf chain. + * Locking model is explained in bpf_tap(). */ void bpf_mtap(struct bpf_if *bp, struct mbuf *m) @@ -1876,13 +1910,13 @@ bpf_mtap(struct bpf_if *bp, struct mbuf *m) } pktlen = m_length(m, NULL); - gottime = BPF_TSTAMP_NONE; - BPFIF_LOCK(bp); + + BPFIF_RLOCK(bp); + LIST_FOREACH(d, &bp->bif_dlist, bd_next) { if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, bp->bif_ifp)) continue; - BPFD_LOCK(d); ++d->bd_rcount; #ifdef BPF_JITTER bf = bpf_jitter_enable != 0 ? d->bd_bfilter : NULL; @@ -1893,6 +1927,8 @@ bpf_mtap(struct bpf_if *bp, struct mbuf *m) #endif slen = bpf_filter(d->bd_rfilter, (u_char *)m, pktlen, 0); if (slen != 0) { + BPFD_WLOCK(d); + d->bd_fcount++; if (gottime < bpf_ts_quality(d->bd_tstamp)) gottime = bpf_gettime(&bt, d->bd_tstamp, m); @@ -1901,10 +1937,10 @@ bpf_mtap(struct bpf_if *bp, struct mbuf *m) #endif catchpacket(d, (u_char *)m, pktlen, slen, bpf_append_mbuf, &bt); + BPFD_WUNLOCK(d); } - BPFD_UNLOCK(d); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } /* @@ -1938,14 +1974,17 @@ bpf_mtap2(struct bpf_if *bp, void *data, u_int dlen, struct mbuf *m) pktlen += dlen; gottime = BPF_TSTAMP_NONE; - BPFIF_LOCK(bp); + + BPFIF_RLOCK(bp); + LIST_FOREACH(d, &bp->bif_dlist, bd_next) { if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, bp->bif_ifp)) continue; - BPFD_LOCK(d); ++d->bd_rcount; slen = bpf_filter(d->bd_rfilter, (u_char *)&mb, pktlen, 0); if (slen != 0) { + BPFD_WLOCK(d); + d->bd_fcount++; if (gottime < bpf_ts_quality(d->bd_tstamp)) gottime = bpf_gettime(&bt, d->bd_tstamp, m); @@ -1954,10 +1993,10 @@ bpf_mtap2(struct bpf_if *bp, void *data, u_int dlen, struct mbuf *m) #endif catchpacket(d, (u_char *)&mb, pktlen, slen, bpf_append_mbuf, &bt); + BPFD_WUNLOCK(d); } - BPFD_UNLOCK(d); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } #undef BPF_CHECK_DIRECTION @@ -2049,7 +2088,7 @@ catchpacket(struct bpf_d *d, u_char *pkt, u_int pktlen, u_int snaplen, int do_timestamp; int tstype; - BPFD_LOCK_ASSERT(d); + BPFD_WLOCK_ASSERT(d); /* * Detect whether user space has released a buffer back to us, and if @@ -2196,7 +2235,7 @@ bpf_freed(struct bpf_d *d) } if (d->bd_wfilter != NULL) free((caddr_t)d->bd_wfilter, M_BPF); - mtx_destroy(&d->bd_mtx); + rw_destroy(&d->bd_lock); } /* @@ -2228,13 +2267,13 @@ bpfattach2(struct ifnet *ifp, u_int dlt, u_int hdrlen, struct bpf_if **driverp) LIST_INIT(&bp->bif_dlist); bp->bif_ifp = ifp; bp->bif_dlt = dlt; - mtx_init(&bp->bif_mtx, "bpf interface lock", NULL, MTX_DEF); + rw_init(&bp->bif_lock, "bpf interface lock"); KASSERT(*driverp == NULL, ("bpfattach2: driverp already initialized")); *driverp = bp; - mtx_lock(&bpf_mtx); + BPF_LOCK(); LIST_INSERT_HEAD(&bpf_iflist, bp, bif_next); - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); bp->bif_hdrlen = hdrlen; @@ -2261,14 +2300,14 @@ bpfdetach(struct ifnet *ifp) /* Find all bpf_if struct's which reference ifp and detach them. 
*/ do { - mtx_lock(&bpf_mtx); + BPF_LOCK(); LIST_FOREACH(bp, &bpf_iflist, bif_next) { if (ifp == bp->bif_ifp) break; } if (bp != NULL) LIST_REMOVE(bp, bif_next); - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); if (bp != NULL) { #ifdef INVARIANTS @@ -2276,11 +2315,11 @@ bpfdetach(struct ifnet *ifp) #endif while ((d = LIST_FIRST(&bp->bif_dlist)) != NULL) { bpf_detachd(d); - BPFD_LOCK(d); + BPFD_WLOCK(d); bpf_wakeup(d); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); } - mtx_destroy(&bp->bif_mtx); + rw_destroy(&bp->bif_lock); free(bp, M_BPF); } } while (bp != NULL); @@ -2304,13 +2343,13 @@ bpf_getdltlist(struct bpf_d *d, struct bpf_dltlist *bfl) ifp = d->bd_bif->bif_ifp; n = 0; error = 0; - mtx_lock(&bpf_mtx); + BPF_LOCK(); LIST_FOREACH(bp, &bpf_iflist, bif_next) { if (bp->bif_ifp != ifp) continue; if (bfl->bfl_list != NULL) { if (n >= bfl->bfl_len) { - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); return (ENOMEM); } error = copyout(&bp->bif_dlt, @@ -2318,7 +2357,7 @@ bpf_getdltlist(struct bpf_d *d, struct bpf_dltlist *bfl) } n++; } - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); bfl->bfl_len = n; return (error); } @@ -2336,19 +2375,19 @@ bpf_setdlt(struct bpf_d *d, u_int dlt) if (d->bd_bif->bif_dlt == dlt) return (0); ifp = d->bd_bif->bif_ifp; - mtx_lock(&bpf_mtx); + BPF_LOCK(); LIST_FOREACH(bp, &bpf_iflist, bif_next) { if (bp->bif_ifp == ifp && bp->bif_dlt == dlt) break; } - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); if (bp != NULL) { opromisc = d->bd_promisc; bpf_detachd(d); bpf_attachd(d, bp); - BPFD_LOCK(d); + BPFD_WLOCK(d); reset_d(d); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); if (opromisc) { error = ifpromisc(bp->bif_ifp, 1); if (error) @@ -2386,22 +2425,22 @@ bpf_zero_counters(void) struct bpf_if *bp; struct bpf_d *bd; - mtx_lock(&bpf_mtx); + BPF_LOCK(); LIST_FOREACH(bp, &bpf_iflist, bif_next) { - BPFIF_LOCK(bp); + BPFIF_RLOCK(bp); LIST_FOREACH(bd, &bp->bif_dlist, bd_next) { - BPFD_LOCK(bd); + BPFD_WLOCK(bd); bd->bd_rcount = 0; bd->bd_dcount = 0; bd->bd_fcount = 0; bd->bd_wcount = 0; bd->bd_wfcount = 0; bd->bd_zcopy = 0; - BPFD_UNLOCK(bd); + BPFD_WUNLOCK(bd); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); } static void @@ -2472,24 +2511,24 @@ bpf_stats_sysctl(SYSCTL_HANDLER_ARGS) if (bpf_bpfd_cnt == 0) return (SYSCTL_OUT(req, 0, 0)); xbdbuf = malloc(req->oldlen, M_BPF, M_WAITOK); - mtx_lock(&bpf_mtx); + BPF_LOCK(); if (req->oldlen < (bpf_bpfd_cnt * sizeof(*xbd))) { - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); free(xbdbuf, M_BPF); return (ENOMEM); } index = 0; LIST_FOREACH(bp, &bpf_iflist, bif_next) { - BPFIF_LOCK(bp); + BPFIF_RLOCK(bp); LIST_FOREACH(bd, &bp->bif_dlist, bd_next) { xbd = &xbdbuf[index++]; - BPFD_LOCK(bd); + BPFD_RLOCK(bd); bpfstats_fill_xbpf(xbd, bd); - BPFD_UNLOCK(bd); + BPFD_RUNLOCK(bd); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); error = SYSCTL_OUT(req, xbdbuf, index * sizeof(*xbd)); free(xbdbuf, M_BPF); return (error); diff --git a/sys/net/bpf.h b/sys/net/bpf.h index c47ad1d..4c4f0c3 100644 --- a/sys/net/bpf.h +++ b/sys/net/bpf.h @@ -1092,14 +1092,19 @@ SYSCTL_DECL(_net_bpf); /* * Descriptor associated with each attached hardware interface. + * FIXME: this structure is exposed to external callers to speed up + * bpf_peers_present() call. 
However we cover all fields not needed by + * this function via BPF_INTERNAL define */ struct bpf_if { LIST_ENTRY(bpf_if) bif_next; /* list of all interfaces */ LIST_HEAD(, bpf_d) bif_dlist; /* descriptor list */ +#ifdef BPF_INTERNAL u_int bif_dlt; /* link layer type */ u_int bif_hdrlen; /* length of link header */ struct ifnet *bif_ifp; /* corresponding interface */ - struct mtx bif_mtx; /* mutex for interface */ + struct rwlock bif_lock; /* interface lock */ +#endif }; void bpf_bufheld(struct bpf_d *d); diff --git a/sys/net/bpf_buffer.c b/sys/net/bpf_buffer.c index e257960..51b418e 100644 --- a/sys/net/bpf_buffer.c +++ b/sys/net/bpf_buffer.c @@ -184,9 +184,9 @@ bpf_buffer_ioctl_sblen(struct bpf_d *d, u_int *i) { u_int size; - BPFD_LOCK(d); + BPFD_WLOCK(d); if (d->bd_bif != NULL) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (EINVAL); } size = *i; @@ -195,7 +195,7 @@ bpf_buffer_ioctl_sblen(struct bpf_d *d, u_int *i) else if (size < BPF_MINBUFSIZE) *i = size = BPF_MINBUFSIZE; d->bd_bufsize = size; - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (0); } diff --git a/sys/net/bpf_zerocopy.c b/sys/net/bpf_zerocopy.c index 60fd76f..1bf0bd6 100644 --- a/sys/net/bpf_zerocopy.c +++ b/sys/net/bpf_zerocopy.c @@ -515,14 +515,14 @@ bpf_zerocopy_ioctl_rotzbuf(struct thread *td, struct bpf_d *d, struct zbuf *bzh; bzero(bz, sizeof(*bz)); - BPFD_LOCK(d); + BPFD_WLOCK(d); if (d->bd_hbuf == NULL && d->bd_slen != 0) { ROTATE_BUFFERS(d); bzh = (struct zbuf *)d->bd_hbuf; bz->bz_bufa = (void *)bzh->zb_uaddr; bz->bz_buflen = d->bd_hlen; } - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (0); } @@ -570,10 +570,10 @@ bpf_zerocopy_ioctl_setzbuf(struct thread *td, struct bpf_d *d, * We only allow buffers to be installed once, so atomically check * that no buffers are currently installed and install new buffers. */ - BPFD_LOCK(d); + BPFD_WLOCK(d); if (d->bd_hbuf != NULL || d->bd_sbuf != NULL || d->bd_fbuf != NULL || d->bd_bif != NULL) { - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); zbuf_free(zba); zbuf_free(zbb); return (EINVAL); @@ -593,6 +593,6 @@ bpf_zerocopy_ioctl_setzbuf(struct thread *td, struct bpf_d *d, * shared management region. 
*/ d->bd_bufsize = bz->bz_buflen - sizeof(struct bpf_zbuf_header); - BPFD_UNLOCK(d); + BPFD_WUNLOCK(d); return (0); } diff --git a/sys/net/bpfdesc.h b/sys/net/bpfdesc.h index d8950b9..9ea4522 100644 --- a/sys/net/bpfdesc.h +++ b/sys/net/bpfdesc.h @@ -87,7 +87,7 @@ struct bpf_d { int bd_sig; /* signal to send upon packet reception */ struct sigio * bd_sigio; /* information for async I/O */ struct selinfo bd_sel; /* bsd select info */ - struct mtx bd_mtx; /* mutex for this descriptor */ + struct rwlock bd_lock; /* per-descriptor lock */ struct callout bd_callout; /* for BPF timeouts with select */ struct label *bd_label; /* MAC label for descriptor */ u_int64_t bd_fcount; /* number of packets which matched filter */ @@ -106,10 +106,19 @@ struct bpf_d { #define BPF_WAITING 1 /* waiting for read timeout in select */ #define BPF_TIMED_OUT 2 /* read timeout has expired in select */ -#define BPFD_LOCK(bd) mtx_lock(&(bd)->bd_mtx) -#define BPFD_UNLOCK(bd) mtx_unlock(&(bd)->bd_mtx) -#define BPFD_LOCK_ASSERT(bd) mtx_assert(&(bd)->bd_mtx, MA_OWNED) +#define BPFD_RLOCK(bd) rw_rlock(&(bd)->bd_lock) +#define BPFD_RUNLOCK(bd) rw_runlock(&(bd)->bd_lock) +#define BPFD_WLOCK(bd) rw_wlock(&(bd)->bd_lock) +#define BPFD_WUNLOCK(bd) rw_wunlock(&(bd)->bd_lock) +#define BPFD_WLOCK_ASSERT(bd) rw_assert(&(bd)->bd_lock, RA_WLOCKED) +#define BPFD_LOCK_ASSERT(bd) rw_assert(&(bd)->bd_lock, RA_LOCKED) +#define BPF_PID_REFRESH(bd, td) (bd)->bd_pid = (td)->td_proc->p_pid +#define BPF_PID_REFRESH_CUR(bd) (bd)->bd_pid = curthread->td_proc->p_pid + +#define BPF_LOCK() mtx_lock(&bpf_mtx) +#define BPF_UNLOCK() mtx_unlock(&bpf_mtx) +#define BPF_LOCK_ASSERT() mtx_assert(&bpf_mtx, MA_OWNED) /* * External representation of the bpf descriptor */ @@ -144,7 +153,9 @@ struct xbpf_d { u_int64_t bd_spare[4]; }; -#define BPFIF_LOCK(bif) mtx_lock(&(bif)->bif_mtx) -#define BPFIF_UNLOCK(bif) mtx_unlock(&(bif)->bif_mtx) +#define BPFIF_RLOCK(bif) rw_rlock(&(bif)->bif_lock) +#define BPFIF_RUNLOCK(bif) rw_runlock(&(bif)->bif_lock) +#define BPFIF_WLOCK(bif) rw_wlock(&(bif)->bif_lock) +#define BPFIF_WUNLOCK(bif) rw_wunlock(&(bif)->bif_lock) #endif diff --git a/sys/security/mac/mac_net.c b/sys/security/mac/mac_net.c index 56ee817..7857749 100644 --- a/sys/security/mac/mac_net.c +++ b/sys/security/mac/mac_net.c @@ -319,6 +319,7 @@ mac_bpfdesc_create_mbuf(struct bpf_d *d, struct mbuf *m) { struct label *label; + /* Assume reader lock is enough. */ BPFD_LOCK_ASSERT(d); if (mac_policy_count == 0) @@ -354,6 +355,7 @@ mac_bpfdesc_check_receive(struct bpf_d *d, struct ifnet *ifp) { int error; + /* Assume reader lock is enough. */ BPFD_LOCK_ASSERT(d); if (mac_policy_count == 0) -- 1.7.9.4 --------------060905060202080703000205--