From owner-freebsd-bugs@FreeBSD.ORG Sun Aug 10 18:56:08 2014 Return-Path: Delivered-To: freebsd-bugs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6E9185F3 for ; Sun, 10 Aug 2014 18:56:08 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 55CE623F6 for ; Sun, 10 Aug 2014 18:56:08 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.8/8.14.8) with ESMTP id s7AIu86X093758 for ; Sun, 10 Aug 2014 18:56:08 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-bugs@FreeBSD.org Subject: [Bug 192558] New: [patch] hard-to-hit crash in bpf catchpacket() Date: Sun, 10 Aug 2014 18:56:08 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.2-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: torek@torek.net X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform op_sys bug_status bug_severity priority component assigned_to reporter Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 18:56:08 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192558 Bug ID: 192558 Summary: [patch] hard-to-hit crash in bpf catchpacket() Product: Base System Version: 9.2-STABLE Hardware: Any OS: Any Status: Needs Triage Severity: Affects Only Me Priority: Normal Component: kern Assignee: freebsd-bugs@FreeBSD.org Reporter: torek@torek.net The symptom is simply a crash that looks like a kernel NULL pointer bug, with the program counter in bcopy(): Fatal trap 12: page fault while in kernel mode cpuid = 8; apic id = 08 fault virtual address = 0x0 fault code = supervisor write data, page not present [snip] The stack trace in ddb looks like this (I stripped out the frames - I don't have a full dump, we have that disabled for various mostly-good reasons, so the raw frames are not really useful here): bcopy() at bcopy+0x16 bpf_mtap() at bpf_mtap+0x1d0 ether_nh_input() at ether_nh_input+0x167 netisr_dispatch_src() at netisr_dispatch_src+0x5e igb_rxeof() at igb_rxeof+0x56d igb_msix_que() at igb_msix_que+0x101 ... although of course it will be some other rxeof() if you're on some other network device. The bpf_mtap() pc is actually in (inlined) catchpacket(), where we do: bpf_append_bytes(d, d->bd_sbuf, curlen, ...) where the bd_bufmode is BPF_BUFMODE_BUFFER (it must be, we have not used the sysctl that enables the zero copy stuff) so this becomes bpf_buffer_append_bytes() which becomes bcopy(), and it appears the whole mess of calls has been inlined. The crash clearly has a NULL d->bd_sbuf: bcopy+0x16: repe movsq (%rsi),%es:(%rdi) rsi 0xfffff82044dbd768 rdi 0 (%es is not very interesting). This means d->bd_sbuf was NULL, which was a bit of a mystery, as the code path starts with a lock assertion: BPFD_LOCK_ASSERT(d); and then has this bit: if (d->bd_fbuf == NULL) { /* * There's no room in the store buffer, and no * prospect of room, so drop the packet. Notify * the * buffer model. */ bpf_buffull(d); ++d->bd_dcount; return; } before doing this: ROTATE_BUFFERS(d); (the ROTATE causes bd_fbuf to move into bd_sbuf). But then I realized that it did this extra step in between these "test fbuf for NULL" and "ROTATE" steps: while (d->bd_hbuf_in_use) mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, PRINET, "bd_hbuf", 0); So, if we assume that the hbuf (hold buffer) *is* in use, we'll sleep waiting for the user to finish with it and then wake us up. While we're asleep, we'll give up d->bd_lock. It sure looks like someone else (some other bpf consumer using the same d->bd_* fields) snuck in while we were asleep and used up d->bd_fbuf, setting it to NULL. Then we got the lock and did our own ROTATE_BUFFERS() and moved the NULL to d->bd_sbuf. If this analysis is correct, the fix is simply to wait for the hbuf to be available *before* checking d->bd_fbuf, i.e., move the while loop up. Environment: System: FreeBSD 9.2-STABLE #0: Sat Aug 9 20:01:00 PDT 2014: root@builder:/usr/obj/usr/src/sys/XXX amd64 Note, this applies to 9.x and 10.x -- the code is the same in both. How-To-Repeat: Difficult. I don't have an actual reproducer. The crash happened just once and I did the analysis based on stack trace and panic, etc. Clearly you need to have two or more threads/procs using bpf, so that one of them can have hbuf in use (during a copyout()). You also need to have lots of traffic flying on the Ethernet (so that the buffers fill). I know the system has dhcpd running, not sure what other things are running at the time that have promiscuous mode enabled on the igb device (but promisc is enabled). Fix: I'm not 100% convinced this is the fix, or the entire fix. But it seems correct and is at least a step to fixing. Here it is in patch form... BTW, I note that there is a similar loop for the zero copy code, waiting for hbuf. It may need to repeat its test after mtx_sleeping. I did not look closely at the zero copy path after I verified we were not using it. It might also be wise to recheck curlen-vs-totlen, etc. Maybe we should just start the whole routine over if we sleep, for each hbuf sleep case. But I think with this change, the not-zero-copy code will always work, albeit possibly in sub-optimal ways in this rare case. --- bpf: check d->bd_fbuf after sleep, not before The code in catchpacket() checks that there's a valid d->bd_fbuf before doing a ROTATE_BUFFERS, which will move the sbuf (store buffer) to the hbuf (hold buffer) and make the fbuf (free buffer) become the sbuf. OK so far, but *after* it verifies this fbuf, it then mtx_sleep-waits for the hold buffer to be available. If the fbuf goes NULL during the wait when we drop the lock, there will be no sbuf once we do the ROTATE_BUFFERS. To fix this, simply check fbuf after possibly waiting for hbuf, rather than before. diff --git a/sys/net/bpf.c b/sys/net/bpf.c --- a/sys/net/bpf.c +++ b/sys/net/bpf.c @@ -2352,6 +2352,9 @@ catchpacket(struct bpf_d *d, u_char *pkt #endif curlen = BPF_WORDALIGN(d->bd_slen); if (curlen + totlen > d->bd_bufsize || !bpf_canwritebuf(d)) { + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET, "bd_hbuf", 0); if (d->bd_fbuf == NULL) { /* * There's no room in the store buffer, and no @@ -2362,9 +2365,6 @@ catchpacket(struct bpf_d *d, u_char *pkt ++d->bd_dcount; return; } - while (d->bd_hbuf_in_use) - mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, - PRINET, "bd_hbuf", 0); ROTATE_BUFFERS(d); do_wakeup = 1; curlen = 0; _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" -- You are receiving this mail because: You are the assignee for the bug.