From owner-freebsd-fs@FreeBSD.ORG Wed Mar 18 10:44:48 2015
Date: Wed, 18 Mar 2015 12:44:42 +0200
From: Konstantin Belousov
To: Mateusz Guzik
Cc: freebsd-fs@freebsd.org
Subject: Re: atomic v_usecount and v_holdcnt
Message-ID: <20150318104442.GS2379@kib.kiev.ua>
References: <20141122002812.GA32289@dft-labs.eu>
 <20141122092527.GT17068@kib.kiev.ua>
 <20141122211147.GA23623@dft-labs.eu>
 <20141124095251.GH17068@kib.kiev.ua>
 <20150314225226.GA15302@dft-labs.eu>
 <20150316094643.GZ2379@kib.kiev.ua>
 <20150317014412.GA10819@dft-labs.eu>
In-Reply-To: <20150317014412.GA10819@dft-labs.eu>
List-Id: Filesystems

On Tue, Mar 17, 2015 at 02:44:12AM +0100, Mateusz Guzik wrote:
> On Mon, Mar 16, 2015 at 11:46:43AM +0200, Konstantin Belousov wrote:
> > On Sat, Mar 14, 2015 at 11:52:26PM +0100, Mateusz Guzik wrote:
> > > On Mon, Nov 24, 2014 at 11:52:52AM +0200, Konstantin Belousov wrote:
> > > > On Sat, Nov 22, 2014 at 10:11:47PM +0100, Mateusz Guzik wrote:
> > > > > On Sat, Nov 22, 2014 at 11:25:27AM +0200, Konstantin Belousov wrote:
> > > > > > On Sat, Nov 22, 2014 at 01:28:12AM +0100, Mateusz Guzik wrote:
> > > > > > > The idea is that we don't need an interlock as long as we don't
> > > > > > > transition either counter 1->0 or 0->1.
> > > > > > I already said that something along the lines of the patch should work.
> > > > > > In fact, you need vnode lock when hold count changes between 0 and 1,
> > > > > > and probably the same for use count.
> > > > > >
> > > > >
> > > > > I don't see why this would be required (not that I'm a VFS expert).
> > > > > vnode recycling seems to be protected with the interlock.
> > > > >
> > > > > In fact I would argue that if this is really needed, current code is
> > > > > buggy.
> > > > Yes, it is already (somewhat) buggy.
> > > >
> > > > Most need of the lock is for the case of counts coming from 1 to 0.
> > > > The reason is the handling of the active vnode list, which is used
> > > > for limiting the amount of vnode list walking in syncer. When hold
> > > > count is decremented to 0, vnode is removed from the active list.
> > > > When use count is decremented to 0, vnode is supposedly inactivated,
> > > > and vinactive() cleans the cached pages belonging to vnode.
> > > > In other words, VI_OWEINACT for a dirty vnode is sort of a bug.
> > > >
> > >
> > > Modified the patch to no longer have the usecount + interlock dropped +
> > > VI_OWEINACT set window.
> > >
> > > Extended 0->1 hold count + vnode not locked window remains. I can fix
> > > that if it is really necessary by having _vhold return with interlock
> > > held if it did such transition.
> >
> > In v_upgrade_usecount(), you call v_incr_devcount() without the interlock
> > held. What prevents the devfs vnode from being recycled, in particular,
> > from invalidation of the v_rdev pointer?
> >
>
> Right, that was buggy. Fixed in the patch below.

Why is the non-atomicity of updates to several counters safe? This at
least requires an explanation in the comment, I mean the holdcnt/usecnt pair.

Assume the thread increased the v_usecount, but did not manage to acquire
dev_mtx. Another thread performs vrele() and progresses to
v_decr_devcount(). It decreases the si_usecount, which might allow yet
another thread to see the si_usecount as too low and start unwanted action.
I think that the tests for VCHR must be done at the very start of the
functions, and devfs vnodes must hold vnode interlock unconditionally.

>
> > I think that refcount_acquire_if_greater() KPI is excessive. You always
> > call acquire with val == 0, and release with val == 1.
> >
>
> Yeah, I noted in my previous e-mail it should be changed (see below).
>
> I replaced them with refcount_acquire_if_not_zero and
> refcount_release_if_not_last.

I dislike the length of the names. Can you propose something shorter?

The type for the local variable old in both functions should be u_int.

>
> > WRT _refcount_release_lock, why can't the lock_object->lc_lock/lc_unlock
> > KPI be used? This allows making refcount_release_lock() a function
> > instead of gcc extension macros. Not to mention that the macro is unused.
>
> These were supposed to be used by other code, forgot to remove it from
> the patch I sent here.
>
> We can discuss this in another thread.
>
> Strictly speaking we could use it here for vnode interlock, but I did
> not want to get around VI_LOCK macro (which right now is just a
> mtx_lock, but this may change).
>
> Updated patch is below:

Do not introduce ASSERT_VI_LOCK; the only difference between the names
ASSERT_VI_LOCKED and ASSERT_VI_LOCK is the broken grammar. I do not see
anything wrong with explicit if() statements where needed, in all four places.

In vputx(), wrap the long line (if (refcount_release() || VI_DOINGINACT)).
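
For illustration only, the two helpers named in the thread could look
roughly like the sketch below. It is not the code from the patch: it is a
userland approximation that uses C11 <stdatomic.h> instead of the kernel
atomic(9) primitives, and the refcount_t typedef, the memory orderings and
the assert() are assumptions made for the example. The point it shows is
that neither function ever performs the 0->1 or 1->0 transition, so a
false return tells the caller to take the interlocked path instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for the kernel's volatile u_int counter. */
typedef _Atomic unsigned int refcount_t;

/*
 * Take a reference only if the count is already non-zero, so the 0->1
 * transition never happens here.  A false return means the caller has
 * to fall back to the locked path.
 */
static bool
refcount_acquire_if_not_zero(refcount_t *count)
{
	unsigned int old;	/* u_int in kernel code */

	old = atomic_load_explicit(count, memory_order_relaxed);
	while (old != 0) {
		/* On failure the CAS reloads old with the current value. */
		if (atomic_compare_exchange_weak_explicit(count, &old, old + 1,
		    memory_order_acquire, memory_order_relaxed))
			return (true);
	}
	return (false);
}

/*
 * Drop a reference only if it is not the last one, so the 1->0
 * transition never happens here.  A false return means the caller
 * holds what may be the last reference and must release it under the lock.
 */
static bool
refcount_release_if_not_last(refcount_t *count)
{
	unsigned int old;	/* u_int in kernel code */

	old = atomic_load_explicit(count, memory_order_relaxed);
	assert(old > 0);	/* releasing an unreferenced object is a bug */
	while (old > 1) {
		if (atomic_compare_exchange_weak_explicit(count, &old, old - 1,
		    memory_order_release, memory_order_relaxed))
			return (true);
	}
	return (false);
}
```

A caller would try the lockless helper first and acquire the vnode
interlock only when it returns false, which is the case where the counter
would cross between 0 and 1.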