Date: Tue, 7 Jun 2016 23:19:19 +0200
From: Jilles Tjoelker
To: Konstantin Belousov
Cc: Mark Johnston, freebsd-current@FreeBSD.org, cem@FreeBSD.org
Subject: Re: thread suspension when dumping core

On Tue, Jun 07, 2016 at 07:01:55PM +0300, Konstantin Belousov wrote:
> On Tue, Jun 07, 2016 at 04:24:53PM +0200, Jilles Tjoelker wrote:
> > On Tue, Jun 07, 2016 at 07:29:56AM +0300, Konstantin Belousov wrote:
> > > This looks as if we should not ignore suspension requests in
> > > thread_suspend_check() completely in TDF_SBDRY case, but return either
> > > EINTR or ERESTART (most likely ERESTART). Note that the goal of
> > > TDF_SBDRY is to avoid suspending in the protected region, not to make an
> > > impression that the suspension does not occur at all.

> > This looks like it would revert r246417 and re-introduce the bug fixed
> > by it (unexpected [EINTR] and short reads/writes after stop signals).

> Well, the patch returns ERESTART and not EINTR, so the syscall should
> be retried after all the unwinding.

That fixes the [EINTR] part of the problem but not short reads and
writes.

> > After r246417, TDF_SBDRY is intended for sleeps that occur while holding
> > resources such as vnode locks and are normally short but should be
> > interruptible by fatal signals because they may occasionally be
> > indefinitely long (such as a non-responsive NFS server).

> > It looks like yet another kind of sleep may be required, since advisory
> > locks still hold some filesystem resources across the sleep (though not
> > vnode locks).

> I do not think that adv locks enter sleep with any resource held which
> would block other threads. But I agree with the statement because the
> lock might be granted and then the stopped thread would appear to own
> the blocking resource.

It does not hold any resource used by normal operations, but it blocks a
forced unmount (umount -f hangs in [purgelocks] with tmpfs in a recent
stable/10).

If queuing is supposed to be fair, then granting the lock to the stopped
thread is correct and aborting the sleep with [ERESTART] would break it.
The kern_lockf.c code seems to go to some lengths to make queuing fair.
This does not seem very important, though.

Also, restarting a locking call violates some text in POSIX XSH's fcntl
page that the range of bytes to be locked shall be determined before the
thread blocks (this may be affected by the current seek offset and the
file size). I don't know whether violating this will break any
applications. The text has the problem that there is no way to
distinguish between a thread that is in fcntl() and has not yet blocked
and a thread that has blocked, even though it seems intuitively clear.

> > We then have four kinds:

> > * uninterruptible by signals, ignores stops (default)
> > * interruptible by signals, ignores stops (current TDF_SBDRY with
> >   PCATCH)
> > * interruptible by signals, freezes in place on stops (avoids
> >   unexpected short I/O) (current PCATCH, otherwise)
> > * interruptible by signals, fails with [ERESTART] on stops (avoids
> >   holding resources across a stop) (new)

> > The new kind of sleep would fail with [ERESTART] only for stops, since
> > [EINTR] should only be returned if a signal handler was called. There
> > cannot be a signal handler since a SIGTSTP/SIGTTIN/SIGTTOU signal with a
> > handler does not stop the process.

> And where would this new kind of sleep be used ? The advlock sleep is the
> one place. Does fifo sleep for reader or writer on open require this kind
> of handling (IMO no) ?

> I think this can be relatively easily implemented with either a flag
> for XXXsleep(9) (my older style of PBDRY) or using only the thread flag
> (jhb's newer TDF_SBDRY approach). Probably the latter should be used, for
> consistency and easier marking of larger blocks of code.

In this case it is clear which sleep(9) calls should be affected so it
may be better to avoid more hidden state.

I also wonder whether we may be overengineering things here. Perhaps the
advlock sleep can simply turn off TDF_SBDRY.

--
Jilles Tjoelker
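
For reference, the four kinds of sleep listed in the quoted text can be
written down as a small decision table. This is only a userland sketch
of the classification as described in this message, not kernel code:
every identifier below (sleep_kind, sleep_event, sleep_outcome,
sleep_react) is made up for illustration and does not exist in the
FreeBSD tree, and the real behaviour depends on thread_suspend_check()
and the sleepqueue code.

```c
/*
 * Userland model of the four sleep kinds discussed above.
 * All names are hypothetical; nothing here is a FreeBSD kernel API.
 */
enum sleep_kind {
	SLEEP_UNINTERRUPTIBLE,	/* default: no PCATCH */
	SLEEP_SBDRY,		/* PCATCH with TDF_SBDRY set */
	SLEEP_CATCH,		/* PCATCH without TDF_SBDRY */
	SLEEP_RESTART_ON_STOP	/* proposed new kind */
};

enum sleep_event {
	EV_SIGNAL,		/* a signal that may interrupt the sleep */
	EV_STOP			/* a stop request (e.g. SIGTSTP, no handler) */
};

enum sleep_outcome {
	OUT_KEEP_SLEEPING,	/* event is ignored by this sleep */
	OUT_INTERRUPTED,	/* sleep is aborted by the signal */
	OUT_FREEZE_IN_PLACE,	/* thread suspends inside the sleep */
	OUT_FAIL_ERESTART	/* sleep fails so the syscall is restarted */
};

/* Map (kind of sleep, event) to the behaviour described in the list. */
enum sleep_outcome
sleep_react(enum sleep_kind kind, enum sleep_event ev)
{
	/* Uninterruptible sleeps ignore both signals and stops. */
	if (kind == SLEEP_UNINTERRUPTIBLE)
		return (OUT_KEEP_SLEEPING);

	/* The other three kinds are all interruptible by signals. */
	if (ev == EV_SIGNAL)
		return (OUT_INTERRUPTED);

	/* They differ only in how they treat a stop request. */
	switch (kind) {
	case SLEEP_SBDRY:
		return (OUT_KEEP_SLEEPING);
	case SLEEP_CATCH:
		return (OUT_FREEZE_IN_PLACE);
	case SLEEP_RESTART_ON_STOP:
		return (OUT_FAIL_ERESTART);
	default:
		return (OUT_KEEP_SLEEPING);
	}
}
```

Read this way, the suggestion of simply turning off TDF_SBDRY for the
advlock sleep corresponds to moving that sleep from the SLEEP_SBDRY row
to the SLEEP_CATCH row instead of adding a fourth row.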