From owner-freebsd-current@freebsd.org Tue Jun 7 14:24:57 2016 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 11A8BB6D44F for ; Tue, 7 Jun 2016 14:24:57 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailhost.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 9DBDF1DD9; Tue, 7 Jun 2016 14:24:56 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id 8AA20B80A0; Tue, 7 Jun 2016 16:24:52 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id 400AD28494; Tue, 7 Jun 2016 16:24:53 +0200 (CEST) Date: Tue, 7 Jun 2016 16:24:53 +0200 From: Jilles Tjoelker To: Konstantin Belousov Cc: Mark Johnston , freebsd-current@FreeBSD.org, cem@FreeBSD.org Subject: Re: thread suspension when dumping core Message-ID: <20160607142452.GA48251@stack.nl> References: <20160604022347.GA1096@wkstn-mjohnston.west.isilon.com> <20160604093236.GA38613@kib.kiev.ua> <20160606171311.GC10101@wkstn-mjohnston.west.isilon.com> <20160607024610.GI38613@kib.kiev.ua> <20160607041741.GA29017@wkstn-mjohnston.west.isilon.com> <20160607042956.GM38613@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160607042956.GM38613@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Jun 2016 14:24:57 -0000 On Tue, Jun 07, 2016 at 07:29:56AM +0300, Konstantin Belousov wrote: > On Mon, Jun 06, 2016 at 09:17:41PM -0700, Mark Johnston wrote: > > Sure, see below. For reference: > > td_flags = 0xa84c = INMEM | SINTR | CANSWAP | ASTPENDING | SBDRY | NEEDSUSPCHK > > td_pflags = 0 > > td_inhibitors = 0x2 = SLEEPING > > td_locks = 0 > > stack: > > mi_switch+0x21e sleepq_catch_signals+0x377 sleepq_wait_sig+0xb _sleep+0x29d ... > > p_flag = 0x10080080 = INMEM | STOPPED_SINGLE | HADTHREADS > > p_flag2 = 0 > > The thread is sleeping interruptibly. The NEEDSUSPCHK flag is set, yet the > > SLEEPABORT flag is not, so the thread can not have been sleeping when > > thread_single() was called - else sleepq_abort() would have been > > invoked and set SLEEPABORT. We are at the second sleepq_switch() call in > > sleepq_catch_signals(), and no signal was pending, so we called > > thread_suspend_check(), which returned 0 because of SBDRY. So we went to > > sleep. > > I note that this couldn't have happened prior to r283320. That change > > was apparently motivated by a similar hang, but in that case the thread > > was suspended (with a vnode lock held) rather than asleep. It looks like > > our internal fix also added a change to set TDF_SBDRY around > > filesystem-specific syscalls, which often sleep interruptibly while > > holding vnode locks. But I don't think that's the problem here, as you > > noted with lf_advlock(). > > With r283320 reverted, P_STOPPED_SIG would not have been set, so > > thread_suspend_check() would have suspended us, allowing the core dump > > to proceed. I had thought that using SINGLE_BOUNDRY beforing coredumping > > would fix both hangs, but I guess that wouldn't help SINGLE_ALLPROC, so > > this is probably the wrong place to be solving the problem. > This looks as if we should not ignore suspension requests in > thread_suspend_check() completely in TDF_SBDRY case, but return either > EINTR or ERESTART (most likely ERESTART). Note that the goal of > TDF_SBDRY is to avoid suspending in the protected region, not to make an > impression that the suspension does not occur at all. This looks like it would revert r246417 and re-introduce the bug fixed by it (unexpected [EINTR] and short reads/writes after stop signals). After r246417, TDF_SBDRY is intended for sleeps that occur while holding resources such as vnode locks and are normally short but should be interruptible by fatal signals because they may occasionally be indefinitely long (such as a non-responsive NFS server). It looks like yet another kind of sleep may be required, since advisory locks still hold some filesystem resources across the sleep (though not vnode locks). We then have four kinds: * uninterruptible by signals, ignores stops (default) * interruptible by signals, ignores stops (current TDF_SBDRY with PCATCH) * interruptible by signals, freezes in place on stops (avoids unexpected short I/O) (current PCATCH, otherwise) * interruptible by signals, fails with [ERESTART] on stops (avoids holding resources across a stop) (new) The new kind of sleep would fail with [ERESTART] only for stops, since [EINTR] should only be returned if a signal handler was called. There cannot be a signal handler since a SIGTSTP/SIGTTIN/SIGTTOU signal with a handler does not stop the process. -- Jilles Tjoelker