From owner-freebsd-current@FreeBSD.ORG  Tue Aug  9 02:34:44 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CD2A31065670;
	Tue,  9 Aug 2011 02:34:44 +0000 (UTC)
	(envelope-from okuno.kohji@jp.panasonic.com)
Received: from smtp.mei.co.jp (smtp.mei.co.jp [133.183.100.20])
	by mx1.freebsd.org (Postfix) with ESMTP id 5524A8FC15;
	Tue,  9 Aug 2011 02:34:44 +0000 (UTC)
Received: from mail-gw.jp.panasonic.com ([157.8.1.157])
	by smtp.mei.co.jp (8.12.11.20060614/3.7W/kc-maile13) with ESMTP id
	p792Yh2p027500; Tue, 9 Aug 2011 11:34:43 +0900 (JST)
Received: from epochmail.jp.panasonic.com ([157.8.1.130])
	by mail.jp.panasonic.com (8.11.6p2/3.7W/kc-maili11) with ESMTP id
	p792YhV01536; Tue, 9 Aug 2011 11:34:43 +0900
Received: by epochmail.jp.panasonic.com (8.12.11.20060308/3.7W/lomi13) id
	p792YhBH029842; Tue, 9 Aug 2011 11:34:43 +0900
Received: from localhost
	by lomi13.jp.panasonic.com (8.12.11.20060308/3.7W) with ESMTP id
	p792YgFT029821; Tue, 9 Aug 2011 11:34:42 +0900
Date: Tue, 09 Aug 2011 11:34:42 +0900 (JST)
Message-Id: <20110809.113442.1986759509406197356.okuno.kohji@jp.panasonic.com>
To: kostikbel@gmail.com
From: Kohji Okuno <okuno.kohji@jp.panasonic.com>
In-Reply-To: <20110805171430.GW17489@deviant.kiev.zoral.com.ua>
References: <20110803135044.GM17489@deviant.kiev.zoral.com.ua>
	<20110805154522.GA2054@a91-153-123-205.elisa-laajakaista.fi>
	<20110805171430.GW17489@deviant.kiev.zoral.com.ua>
Organization: Panasonic Corporation
X-Mailer: Mew version 6.3 on Emacs 23.3 / Mule 6.0 (HANACHIRUSATO)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: jh@freebsd.org, freebsd-current@freebsd.org, okuno.kohji@jp.panasonic.com
Subject: Re: Bug: devfs is sure to have the bug.
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Aug 2011 02:34:45 -0000

Hi Kostic and Jaakko,

> On Fri, Aug 05, 2011 at 06:45:22PM +0300, Jaakko Heinonen wrote:
>> On 2011-08-03, Kostik Belousov wrote:
>> > On Wed, Aug 03, 2011 at 02:44:23PM +0900, Kohji Okuno wrote:
>> > > > devfs_populate(), and the context holds only "dm->dm_lock" in
>> > > > devfs_populate().
>> > > > 
>> > > > On the other hand, "devfs_generation" is incremented in devfs_create()
>> > > > and devfs_destroy() the context holds only "devmtx" in devfs_create()
>> > > > and devfs_destroy().
>> > > > 
>> > > > If a context executes devfs_create() when other context is executing
>> > > > (***), then "dm->dm_generation" is updated incorrect value.
>> > > > As a result, we can not open the last detected device (we receive ENOENT).
>> > 
>> > I think the problem you described is real, and suggested change is right.
>> > Initially, I thought that we should work with devfs_generation as with
>> > the atomic type due to unlocked access in the devfs_populate(), but then
>> > convinced myself that this is not needed.
>> > 
>> > But also, I think there is another half of the problem. Namely,
>> > devfs_lookup() calls devfs_populate_vp(), and then does lookup with the
>> > help of devfs_lookupx(). We will miss the generation update
>> > happen after the drop of the dm_lock in devfs_populate_vp() to reacquire
>> > the directory vnode lock.
>> 
>> I don't understand this. devfs_generation is not protected with dm_lock
>> in devfs_create() and devfs_destroy(). On the other hand if you mean
>> that another thread calls devfs_populate() while we drop dm_lock in
>> devfs_populate_vp(), isn't the mount point up to date when we re-lock
>> dm_lock?
> Yes, I was not quite exact in describing what I mean, and the reference
> to dm_lock drop is both vague and not correct.
> 
> I am after the fact that we do allow the situation where it is externally
> visible that new cdev node was successfully created before the lookup
> returns ENOENT for the path of the node.
> 
>> 
>> > @@ -630,13 +630,15 @@ devfs_populate_loop(struct devfs_mount *dm, int cleanup)
>> >  void
>> >  devfs_populate(struct devfs_mount *dm)
>> >  {
>> > +	unsigned gen;
>> >  
>> >  	sx_assert(&dm->dm_lock, SX_XLOCKED);
>> > -	if (dm->dm_generation == devfs_generation)
>> > +	gen = devfs_generation;
>> > +	if (dm->dm_generation == gen)
>> >  		return;
>> >  	while (devfs_populate_loop(dm, 0))
>> >  		continue;
>> > -	dm->dm_generation = devfs_generation;
>> > +	dm->dm_generation = gen;
>> >  }
>> 
>> After this change dm->dm_generation may be stale although the mount
>> point is up to date? This is probably harmless, though.
> This must be harmless, in the worst case it will cause more calls to
> the populate. In fact, this even allows the dm_generation to roll backward,
> which is again harmless.

After all, do we only have to fix only "devfs_populate()"?
I think that there is no problem while I am testing the 8.1-RELEASE
environment that fix only devfs_populate().

Many thanks,
 Kohji Okuno