From owner-freebsd-current@freebsd.org Wed Jun 8 13:56:46 2016
Date: Wed, 8 Jun 2016 16:56:35 +0300
From: Konstantin Belousov
To: Mark Johnston
Cc: Jilles Tjoelker, freebsd-current@FreeBSD.org, cem@FreeBSD.org
Subject: Re: thread suspension when dumping core
Message-ID: <20160608135635.GY38613@kib.kiev.ua>
In-Reply-To: <20160608133508.GA93263@charmander>
User-Agent: Mutt/1.6.1 (2016-04-27)
List-Id: Discussions about the use of FreeBSD-current

On Wed, Jun 08, 2016 at 06:35:08AM -0700, Mark Johnston wrote:
> On Wed, Jun 08, 2016 at 07:30:55AM +0300, Konstantin Belousov wrote:
> > On Tue, Jun 07, 2016 at 11:19:19PM +0200, Jilles Tjoelker wrote:
> > > I also wonder whether we may be overengineering things here. Perhaps
> > > the advlock sleep can simply turn off TDF_SBDRY.
> > Well, this was the very first patch suggested. I would be fine with that,
> > but again, out-of-tree code seems to be not quite fine with that local
> > solution.
>
> In our particular case, we could possibly use a similar approach. In
> general, it seems incorrect to clear TDF_SBDRY if the thread calling
> sx_sleep() has any locks held. It is easy to verify that all callers of
> lf_advlock() are safe in this respect, but this kind of auditing is
> generally hard. In fact, I believe the sx_sleep that led to the problem
> described in D2612 is the same as the one in my case. That is, the
> sleeping thread may or may not hold a vnode lock depending on context.

I do not think that in-tree code sleeps in lf_advlock() with a vnode
lock held. Otherwise, the system would hang in a lock cascade on an
attempt to obtain an advisory lock. I think we can even assert this
with witness.

There is another sleep, which Jilles mentioned, in lf_purgelocks(),
called from vgone(). This sleep indeed occurs under the vnode lock,
and as such must be non-suspendable.
The sleep waits until other threads leave lf_advlock() for the
reclaimed vnode, and they should leave in deterministic time due to
the issued wakeups. So this sleep is exempt from these considerations,
and TDF_SBDRY there is correct.

I am fine with either the braces around sx_sleep() in lf_advlock() to
clear TDF_SBDRY (sigdeferstop()), or with the latest patch I sent,
which adds a temporary override of TDF_SBDRY with TDF_SRESTART. My
understanding is that you prefer the latter. If I do not misrepresent
your position, I understand why you prefer that.
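
For illustration only, the first option (bracketing the sleep so that
the deferral flag is cleared just for its duration and then restored)
can be sketched as a small user-space model. This is not the kernel
code and not the patch discussed in this thread: struct model_thread,
MODEL_TDF_SBDRY and all model_* helpers are hypothetical names made up
for the sketch, and the actual sleep is replaced by a check of whether
the thread would be suspendable at that point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread "defer stops in sleep" flag. */
#define MODEL_TDF_SBDRY 0x1

/* Hypothetical, minimal model of a thread: only the flag word. */
struct model_thread {
	int flags;
};

/* Clear the deferral flag and return its previous state. */
static int
model_allow_stop_begin(struct model_thread *td)
{
	int prev;

	assert(td != NULL);
	prev = td->flags & MODEL_TDF_SBDRY;
	td->flags &= ~MODEL_TDF_SBDRY;
	return (prev);
}

/* Restore the deferral flag to the state saved by the begin call. */
static void
model_allow_stop_end(struct model_thread *td, int prev)
{
	assert(td != NULL);
	td->flags |= prev;
}

/* A sleep is suspendable in this model only while the flag is clear. */
static int
model_sleep_is_suspendable(const struct model_thread *td)
{
	return ((td->flags & MODEL_TDF_SBDRY) == 0);
}

/*
 * Model of a sleep that must stay suspendable: the "braces" clear the
 * flag before the sleep and restore it afterwards. Returns whether the
 * thread was suspendable at the point where the sleep would happen.
 */
static int
model_bracketed_sleep(struct model_thread *td)
{
	int prev, suspendable;

	prev = model_allow_stop_begin(td);
	suspendable = model_sleep_is_suspendable(td); /* sleep goes here */
	model_allow_stop_end(td, prev);
	return (suspendable);
}
```

In this model, a thread that enters model_bracketed_sleep() with
MODEL_TDF_SBDRY set is suspendable inside the brackets and has the
flag set again on return. A sleep like the one in lf_purgelocks()
would simply not use the brackets and so would stay non-suspendable.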