Date:      Fri, 30 Oct 2009 01:08:08 +0200
From:      Giorgos Keramidas <keramida@freebsd.org>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r198590 - head/sys/kern
Message-ID:  <871vklbxyf.fsf@kobe.laptop>
In-Reply-To: <200910291434.n9TEYOVJ099388@svn.freebsd.org> (Konstantin Belousov's message of "Thu, 29 Oct 2009 14:34:24 +0000 (UTC)")
References:  <200910291434.n9TEYOVJ099388@svn.freebsd.org>

On Thu, 29 Oct 2009 14:34:24 +0000 (UTC), Konstantin Belousov <kib@FreeBSD.org> wrote:
> Author: kib
> Date: Thu Oct 29 14:34:24 2009
> New Revision: 198590
> URL: http://svn.freebsd.org/changeset/base/198590
>
> Log:
>   Trapsignal() calls kern_sigprocmask() when delivering catched signal
>   with proc lock held.
>
>   Reported and tested by:	Mykola Dzham  freebsd at levsha org ua
>   MFC after:	1 month

Hi Konstantin,

Some of the recent kern_sig changes end up recursing on a non-recursive
mutex in kern_sigprocmask() -> reschedule_signals():

panic: _mtx_lock_sleep: recursed on non-recursive mutex sigacts @ /usr/src/sys/kern/kern_sig.c:2422
(kgdb) bt
#0  doadump () at pcpu.h:246
#1  0xc0680bee in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2  0xc0680ec2 in panic (fmt=Variable "fmt" is not available.
    ) at /usr/src/sys/kern/kern_shutdown.c:579
#3  0xc06716ea in _mtx_lock_sleep (m=0xc8154aa8, tid=3332925072, opts=0, file=0xc09bb332 "/usr/src/sys/kern/kern_sig.c", line=2422)
    at /usr/src/sys/kern/kern_mutex.c:341
#4  0xc0671907 in _mtx_lock_flags (m=0xc8154aa8, opts=0, file=0xc09bb332 "/usr/src/sys/kern/kern_sig.c", line=2422)
    at /usr/src/sys/kern/kern_mutex.c:203
#5  0xc0683434 in reschedule_signals (p=0xc71172a8, block={__bits = {0, 0, 0, 0}}) at /usr/src/sys/kern/kern_sig.c:2422
#6  0xc0683751 in kern_sigprocmask (td=0xc6a86690, how=1, set=0xe9005cd4, oset=0x0, flags=2) at /usr/src/sys/kern/kern_sig.c:1027
#7  0xc0684801 in postsig (sig=20) at /usr/src/sys/kern/kern_sig.c:2743
#8  0xc06be228 in ast (framep=0xe9005d38) at /usr/src/sys/kern/subr_trap.c:234
#9  0xc0920624 in doreti_ast () at /usr/src/sys/i386/i386/exception.s:365

I think the change that started causing this for MT applications was
change 197963 in /head that added this bit of code in kern_sig.c inside
kern_sigprocmask():

: @@ -1012,7 +1012,20 @@
:                          break;
:                  }
:          }
: -        PROC_UNLOCK(td->td_proc);
: +        /*
: +         * The new_block set contains signals that were not previosly
: +         * blocked, but are blocked now.
: +         *
: +         * In case we block any signal that was not previously blocked
: +         * for td, and process has the signal pending, try to schedule
: +         * signal delivery to some thread that does not block the signal,
: +         * possibly waking it up.
: +         */
: +        if (p->p_numthreads != 1)
: +                reschedule_signals(p, new_block);
: +
: +        PROC_UNLOCK(p);
:          return (error);

AFAICT, postsig() is called with proc->p_sigacts->ps_mtx locked, so we
end up recursing when reschedule_signals() tries to lock it once more.

Since we are holding the proc lock in kern_sigprocmask(), is it safe to
assert that we own ps_mtx, drop it before calling reschedule_signals(),
and re-acquire it immediately after the call returns?



