Date: Sat, 9 Aug 2014 00:49:56 -0700 From: Adrian Chadd <adrian@freebsd.org> To: Chris Torek <torek@torek.net> Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org> Subject: Re: crash in bpf catchpacket() code Message-ID: <CAJ-VmonfSfsU1zwN5t-zgwbFeQm7zV1ikihYHzryjv3nWseBmQ@mail.gmail.com> In-Reply-To: <201408090131.s791VeDx049988@elf.torek.net> References: <201408090131.s791VeDx049988@elf.torek.net>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi! Would you mind submitting a PR for this? You've done all the great work needed to chase this down; I'd hate for it to be forgotten! -a On 8 August 2014 18:31, Chris Torek <torek@torek.net> wrote: > We saw a crash with this stack trace (I snipped out the "frame"s > here as not very useful to others, plus there's no full dump, just > a text dump backtrace and some other bits): > > bcopy() at bcopy+0x16 > bpf_mtap() at bpf_mtap+0x1d0 > ether_nh_input() at ether_nh_input+0x167 > netisr_dispatch_src() at netisr_dispatch_src+0x5e > igb_rxeof() at igb_rxeof+0x56d > igb_msix_que() at igb_msix_que+0x101 > > The bpf_mtap() pc is actually in (inlined) catchpacket(), where > we do: > > bpf_append_bytes(d, d->bd_sbuf, curlen, ...) > > where the bd_bufmode is BPF_BUFMODE_BUFFER so this becomes > bpf_buffer_append_bytes() which becomes bcopy(), and it appears > the whole mess of calls has been inlined. > > The crash clearly has a NULL d->bd_sbuf: > > bcopy+0x16: repe movsq (%rsi),%es:(%rdi) > > rsi 0xfffff82044dbd768 > rdi 0 > > (%es is not very interesting). This means d->bd_sbuf was NULL, > which was a bit of a mystery, as the code path starts with a lock > assertion: > > BPFD_LOCK_ASSERT(d); > > and then has this bit: > > if (d->bd_fbuf == NULL) { > /* > * There's no room in the store buffer, and no > * prospect of room, so drop the packet. Notify > * the > * buffer model. > */ > bpf_buffull(d); > ++d->bd_dcount; > return; > } > > before doing this: > > ROTATE_BUFFERS(d); > > (the ROTATE causes bd_fbuf to move into bd_sbuf). > > But then I realized that it did this extra step in between these > "test fbuf for NULL" and "ROTATE" steps: > > while (d->bd_hbuf_in_use) > mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, > PRINET, "bd_hbuf", 0); > > So, if we assume that the hbuf (hold buffer) *is* in use, we'll > sleep waiting for the user to finish with it and then wake us up. > While we're asleep, we'll give up d->bd_lock. > > It sure looks like someone else (some other bpf consumer using the > same d->bd_* fields) snuck in while we were asleep and used up > d->bd_fbuf, setting it to NULL. Then we got the lock and did our > own ROTATE_BUFFERS() and moved the NULL to d->bd_sbuf. > > If this analysis is correct, the fix is simply to wait for the > hbuf to be available *before* checking d->bd_fbuf, i.e., move the > while loop up. > > Because the bug is (apparently) really hard to hit (it's not like > we can reproduce this at will), I'm not 100% convinced this is the > fix (or the entire fix). But here it is in patch form... > > Chris > > --- > > bpf: check d->bd_fbuf after sleep, not before > > The code in catchpacket() checks that there's a valid d->bd_fbuf > before doing a ROTATE_BUFFERS, which will move the sbuf (store > buffer) to the hbuf (hold buffer) and make the fbuf (free buffer) > become the sbuf. OK so far, but *after* it verifies this fbuf, > it then mtx_sleep-waits for the hold buffer to be available. If > the fbuf goes NULL during the wait when we drop the lock, there > will be no sbuf once we do the ROTATE_BUFFERS. > > To fix this, simply check fbuf after possibly waiting for hbuf, > rather than before. > > > diff --git a/sys/net/bpf.c b/sys/net/bpf.c > --- a/sys/net/bpf.c > +++ b/sys/net/bpf.c > @@ -2352,6 +2352,9 @@ catchpacket(struct bpf_d *d, u_char *pkt > #endif > curlen = BPF_WORDALIGN(d->bd_slen); > if (curlen + totlen > d->bd_bufsize || !bpf_canwritebuf(d)) { > + while (d->bd_hbuf_in_use) > + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, > + PRINET, "bd_hbuf", 0); > if (d->bd_fbuf == NULL) { > /* > * There's no room in the store buffer, and no > @@ -2362,9 +2365,6 @@ catchpacket(struct bpf_d *d, u_char *pkt > ++d->bd_dcount; > return; > } > - while (d->bd_hbuf_in_use) > - mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, > - PRINET, "bd_hbuf", 0); > ROTATE_BUFFERS(d); > do_wakeup = 1; > curlen = 0; > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-VmonfSfsU1zwN5t-zgwbFeQm7zV1ikihYHzryjv3nWseBmQ>