From: Chris Torek
Date: Sun, 10 Aug 2014 13:01:07 -0600 (MDT)
To: adrian@freebsd.org
Cc: freebsd-hackers@freebsd.org
Subject: Re: crash in bpf catchpacket() code
Message-Id: <201408101901.s7AJ17Lu006439@elf.torek.net>

>Would you mind submitting a PR for this? You've done all the great
>work needed to chase this down; I'd hate for it to be forgotten!

Sent.

I expanded a bit more on some thoughts about the other mtx_sleep
case in the code (in the zero-copy stuff); the patch I gave may
leave a bug there, and is probably sub-optimal (I was going for a
minimal change that would keep our system from crashing :-) ).

(Although a "goto restart" after the mtx_sleep would not be exactly
optimal either, as we'd redo a bunch of effectively constant work.
Anyway, the root of the bug appears to be that mtx_sleep drops our
bpf_d descriptor lock, and the code assumes we hold it throughout.)

Chris
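
The root cause described above (a sleep that releases the lock while
the caller goes on trusting state it read earlier) can be sketched
in plain userland C. This is only a single-threaded model, not the
bpf code: struct desc, model_sleep(), rotate_buffers() and the two
pick_buffer_*() functions are all hypothetical names invented for
the illustration, and the "other thread" is simulated by a direct
call made while the lock is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a descriptor protected by a lock. */
struct desc {
	bool locked;	/* models the descriptor mutex */
	char *buf;	/* buffer currently being filled */
	char a[64];
	char b[64];
};

/*
 * Models some other thread that gets to run while the sleeper has
 * released the lock: it swaps the current buffer.
 */
void
rotate_buffers(struct desc *d)
{
	assert(!d->locked);	/* only possible because the lock was dropped */
	d->locked = true;
	d->buf = (d->buf == d->a) ? d->b : d->a;
	d->locked = false;
}

/*
 * Models a sleep that releases the lock for its duration and takes
 * it again before returning, so the caller "holds" it on both sides.
 */
void
model_sleep(struct desc *d)
{
	assert(d->locked);
	d->locked = false;
	rotate_buffers(d);	/* anything may happen in this window */
	d->locked = true;
}

/* Buggy pattern: state read before the sleep is used after it. */
char *
pick_buffer_stale(struct desc *d)
{
	char *cached;

	d->locked = true;
	cached = d->buf;	/* read under the lock ... */
	model_sleep(d);		/* ... but the lock is dropped in here */
	d->locked = false;
	return (cached);	/* may no longer be the current buffer */
}

/* Safer pattern: re-read lock-protected state after waking up. */
char *
pick_buffer_reloaded(struct desc *d)
{
	char *fresh;

	d->locked = true;
	model_sleep(d);
	fresh = d->buf;		/* re-read after the sleep */
	d->locked = false;
	return (fresh);
}
```

With d.buf initially pointing at d.a, pick_buffer_stale() hands back
a pointer that no longer matches d.buf, while pick_buffer_reloaded()
returns the current one. Re-reading after the sleep is the cheap end
of the spectrum; a "goto restart" would redo everything, including
work that has not changed.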