From: John Baldwin
To: Bruce Evans
Cc: cvs-src@freebsd.org, src-committers@freebsd.org, Ed Schouten, cvs-all@freebsd.org
Date: Tue, 12 Aug 2008 13:44:56 -0400
Subject: Re: cvs commit: src/sys/dev/io iodev.c
Message-Id: <200808121344.57238.jhb@freebsd.org>

On Tuesday 12 August 2008 01:23:47 pm Bruce Evans wrote:
> On Tue, 12 Aug 2008, John Baldwin wrote:
> > Of course bpf is broken with revoke, but nobody uses revoke with
> > bpf.  What people do do in the normal course of using bpf is lots
> > of concurrent bpf accesses, and w/o D_TRACKCLOSE, bpf devices don't
> > get closed.
>
> Why not fix the actual bug?
>
> Your commit doesn't give enough details on the actual bug, so I will
> try to guess it: libpcap has to probe for a free bpf unit, so it does
> lots of failing opens of bpfN (especially when N == 0) when bpfN is
> already in use.  Failing opens break last closes with which they are
> concurrent, because the relevant reference count (si_usecount) is
> increased during the failing open (I think it is the vref() in
> _fgetvp() that does it).  Then when the opens fail, si_usecount is
> decremented to 1, but devfs_close() is not called again because only
> 1 real last close is possible (I think -- at least without races with
> revoke()), so d_close() is never called twice for 1 real last close.
> Failing opens shouldn't take long, so it is surprising that the race
> is often lost.  Apparently there is some synchronization.

Correct-ish.  The actual extra reference is taken in devfs_lookup()
rather than in _fgetvp().  Specifically, the bug is an open concurrent
with a close.  The opening thread first does the lookup, which bumps
the reference count in devfs_allocv() (IIRC).  Eventually it gets to
devfs_open(), which drops the vnode lock while it invokes d_open().
If the closing thread is waiting on the vnode lock to do its close, it
can now acquire the lock and run devfs_close(), which sees
count_dev() > 1 and doesn't call d_close().  Meanwhile, the opening
thread fails in bpfopen() and returns, but the bpf device is left
permanently open.  When I talked with phk@ about this originally, his
reply to my various suggestions was D_TRACKCLOSE.
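For reference, what D_TRACKCLOSE buys a driver is that devfs calls
d_close() on every close(2), instead of only on what count_dev() says
is the last close, so the driver can keep its own authoritative open
state.  A minimal sketch of the idea (the foo_* names are made up and
this is not the actual bpf code, just the era's cdevsw interface):

#include <sys/param.h>
#include <sys/conf.h>
#include <sys/errno.h>

static int foo_busy;	/* driver's own open state; real code would lock it */

static int
foo_open(struct cdev *dev, int oflags, int devtype, struct thread *td)
{
	/* Exclusive open, like a bpf unit: fail if already in use. */
	if (foo_busy)
		return (EBUSY);
	foo_busy = 1;
	return (0);
}

static int
foo_close(struct cdev *dev, int fflag, int devtype, struct thread *td)
{
	/*
	 * With D_TRACKCLOSE this runs for every close(2), so a racing
	 * failed open can no longer make devfs skip the real last
	 * close and leave foo_busy stuck at 1.
	 */
	foo_busy = 0;
	return (0);
}

static struct cdevsw foo_cdevsw = {
	.d_version =	D_VERSION,
	.d_flags =	D_TRACKCLOSE,	/* call d_close() on every close() */
	.d_open =	foo_open,
	.d_close =	foo_close,
	.d_name =	"foo",
};

Without the flag, foo_close() would only run when devfs decides the
last reference is going away, and that decision is based on exactly
the si_usecount accounting that the concurrent failing opens disturb.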
I'm not sure how you'd really fix this otherwise:

- Call d_close() from devfs_open() if the use count is 1 after
  re-acquiring the vnode lock.  You still have to drop the vnode lock
  while you call d_close(), though, so you might race with other
  concurrent opens.

- Keep a count of pending opens and subtract it from count_dev() when
  deciding whether to call d_close().  That seems dubious: you might
  then call d_close() while d_open() is still running, so every driver
  would need locking to synchronize the two and to handle a d_close()
  that runs after a successful d_open(), duplicating that work in each
  driver.

Etc.

-- 
John Baldwin