From owner-freebsd-stable@FreeBSD.ORG Sun Jan 17 19:53:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 631411065696; Sun, 17 Jan 2010 19:53:20 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (skuns.zoral.com.ua [91.193.166.194]) by mx1.freebsd.org (Postfix) with ESMTP id A4F8B8FC14; Sun, 17 Jan 2010 19:53:19 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o0HJr8hM008452 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 17 Jan 2010 21:53:08 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id o0HJr8Bx043510; Sun, 17 Jan 2010 21:53:08 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id o0HJr7Af043509; Sun, 17 Jan 2010 21:53:07 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Sun, 17 Jan 2010 21:53:07 +0200 From: Kostik Belousov To: David Xu Message-ID: <20100117195307.GJ62907@deviant.kiev.zoral.com.ua> References: <4B4D0293.3040704@rogers.com> <201001140941.46748.tijl@coosemans.org> <4B4FC56A.4020007@freebsd.org> <201001161451.39399.tijl@coosemans.org> <4B526C69.6050300@freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="SR0DFWOzPg+ymaCV" Content-Disposition: inline In-Reply-To: <4B526C69.6050300@freebsd.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Gardner Bell , Tijl Coosemans , freebsd-stable@freebsd.org Subject: Re: process in STOP state X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jan 2010 19:53:20 -0000 --SR0DFWOzPg+ymaCV Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Jan 17, 2010 at 09:48:25AM +0800, David Xu wrote: > Tijl Coosemans wrote: > >On Friday 15 January 2010 02:31:22 David Xu wrote: > > =20 > >>Tijl Coosemans wrote: > >> =20 > >>>>Besides weird formatting of procstat -k output, I do not see > >>>>anything wrong in the state of the process. It got SIGSTOP, I am > >>>>sure. Attaching gdb helps because debugger gets signal reports > >>>>instead of target process getting the signal actions on signal > >>>>delivery. > >>>> > >>>>The only question is why the process gets SIGSTOP at all. > >>>> =20 > >>>Wine uses ptrace(2) sometimes. The SIGSTOP could have come from > >>>that. I recently submitted > >>>http://www.freebsd.org/cgi/query-pr.cgi?pr=3D142757 describing a > >>>problem with ptrace and signals, so you might want to give the > >>>kernel patch a try. > >>> =20 > >>The problem in your patch is that ksi pointer can not be hold across > >>thread sleeping, because once the process is resumed, there is no > >>guarantee that the thread will run first, once the signal came from > >>process's signal queue, other threads can remove the signal, and here > >>your sigqueue_take(ksi) is dangerous code. > >> =20 > > > >If other threads can run before the current thread then there's a > >second problem next to the one in the PR (current thread deletes > >signal that shouldn't be deleted). Then those other threads can see > >that the SIGSTOP bit (or another signal) is still set and stop the > >process a second time. This might be what happens in the OP's case. > > > >So, the signal has to be cleared before suspending the process, but > >then other threads can still deliver other signals which might change > >delivery order and I don't see any way around that besides introducing > >a per process signal lock that is also kept while the process is > >stopped. Comments? > > > > =20 > I don't have an idea now, we ever delivered signal to thread's queue, > though it may lose signal if thread exits, but it does not have the probl= em > you have described here, we may need to rethink how to fix the signal-lost > problem but still deliver signal to thread's queue, just an idea. >=20 I do think that signal shall be removed from the queue before ptracestop(), and I do not see a problem with other threads taking other signals. ptracestop() stops the whole process, and delivery order when signals go to different threads is somewhat vague due to scheduling events etc. So even if we guarantee that thread goes to userland for signal delivery before another thread, that another thread may still run handler observable earlier due to scheduler, swapping etc. I think your fix might be slightly reworked, making three points: 1. do remove signal from queue, and reinsert it into the _head_ of the _thread_ queue after. We already got it from the head, and current thread did not masked the signal. Since thread cannot change the mask in between, this signal is the first to be delivered to this thread, and current thread is still eligible for delivery. I added KSI_HEAD flag to adjust behaviour of sigqueue_add(). 2. Instead of adding signal to sq_kill, I just call sigqueue_add with NULL ksi. 3. To remove ksi from the queue, but not freeing it, I had to copy/paste the sigqueue_get() into sigqueue_steal(). It returns pointer to the removed ksi. I did not found good way to merge _get and steal. diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c index 1c21bc5..80e5a07 100644 --- a/sys/kern/kern_sig.c +++ b/sys/kern/kern_sig.c @@ -316,6 +316,42 @@ sigqueue_get(sigqueue_t *sq, int signo, ksiginfo_t *si) return (signo); } =20 +static int +sigqueue_steal(sigqueue_t *sq, int signo, ksiginfo_t **si) +{ + struct proc *p; + struct ksiginfo *ksi, *next; + int count; + + KASSERT(sq->sq_flags & SQ_INIT, ("sigqueue not inited")); + p =3D sq->sq_proc; + count =3D 0; + + if (!SIGISMEMBER(sq->sq_signals, signo)) + return (0); + + if (SIGISMEMBER(sq->sq_kill, signo)) { + count++; + SIGDELSET(sq->sq_kill, signo); + } + + TAILQ_FOREACH_SAFE(ksi, &sq->sq_list, ksi_link, next) { + if (ksi->ksi_signo =3D=3D signo) { + if (count =3D=3D 0) { + TAILQ_REMOVE(&sq->sq_list, ksi, ksi_link); + ksi->ksi_sigq =3D NULL; + *si =3D ksi; + } + if (++count > 1) + break; + } + } + + if (count <=3D 1) + SIGDELSET(sq->sq_signals, signo); + return (signo); +} + void sigqueue_take(ksiginfo_t *ksi) { @@ -357,7 +393,10 @@ sigqueue_add(sigqueue_t *sq, int signo, ksiginfo_t *si) =20 /* directly insert the ksi, don't copy it */ if (si->ksi_flags & KSI_INS) { - TAILQ_INSERT_TAIL(&sq->sq_list, si, ksi_link); + if (si->ksi_flags & KSI_HEAD) + TAILQ_INSERT_HEAD(&sq->sq_list, si, ksi_link); + else + TAILQ_INSERT_TAIL(&sq->sq_list, si, ksi_link); si->ksi_sigq =3D sq; goto out_set_bit; } @@ -2492,6 +2531,7 @@ issignal(struct thread *td, int stop_allowed) struct sigacts *ps; struct sigqueue *queue; sigset_t sigpending; + ksiginfo_t *psi; int sig, prop, newsig; =20 p =3D td->td_proc; @@ -2529,24 +2569,24 @@ issignal(struct thread *td, int stop_allowed) if (p->p_flag & P_TRACED && (p->p_flag & P_PPWAIT) =3D=3D 0) { /* * If traced, always stop. + * Remove old signal from queue before the stop. + * XXX shrug off debugger, it causes siginfo to + * be thrown away. */ + queue =3D &td->td_sigqueue; + psi =3D NULL; + if (sigqueue_steal(queue, sig, &psi) =3D=3D 0) { + queue =3D &p->p_sigqueue; + sigqueue_steal(queue, sig, &psi); + } + =09 mtx_unlock(&ps->ps_mtx); newsig =3D ptracestop(td, sig); mtx_lock(&ps->ps_mtx); =20 if (sig !=3D newsig) { - ksiginfo_t ksi; - - queue =3D &td->td_sigqueue; - /* - * clear old signal. - * XXX shrug off debugger, it causes siginfo to - * be thrown away. - */ - if (sigqueue_get(queue, sig, &ksi) =3D=3D 0) { - queue =3D &p->p_sigqueue; - sigqueue_get(queue, sig, &ksi); - } + if (psi !=3D NULL && ksiginfo_tryfree(psi)) + p->p_pendingcnt--; =20 /* * If parent wants us to take the signal, @@ -2561,10 +2601,14 @@ issignal(struct thread *td, int stop_allowed) * Put the new signal into td_sigqueue. If the * signal is being masked, look for other signals. */ - SIGADDSET(queue->sq_signals, sig); + sigqueue_add(queue, sig, NULL); if (SIGISMEMBER(td->td_sigmask, sig)) continue; signotify(td); + } else { + if (psi !=3D NULL) + psi->ksi_flags |=3D KSI_INS | KSI_HEAD; + sigqueue_add(&td->td_sigqueue, sig, psi); } =20 /* diff --git a/sys/sys/signalvar.h b/sys/sys/signalvar.h index c9455c2..c35fe69 100644 --- a/sys/sys/signalvar.h +++ b/sys/sys/signalvar.h @@ -234,6 +234,7 @@ typedef struct ksiginfo { #define KSI_EXT 0x02 /* Externally managed ksi. */ #define KSI_INS 0x04 /* Directly insert ksi, not the copy */ #define KSI_SIGQ 0x08 /* Generated by sigqueue, might ret EGAIN. */ +#define KSI_HEAD 0x10 /* Insert into head, not tail. */ #define KSI_COPYMASK (KSI_TRAP|KSI_SIGQ) =20 #define KSI_ONQ(ksi) ((ksi)->ksi_sigq !=3D NULL) --SR0DFWOzPg+ymaCV Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAktTaqIACgkQC3+MBN1Mb4g7pACgxtQ036HFhit7SrcTRTZ1mzef SiMAni95XltrO6uwEPVuxqaYmT7gilQr =DXYA -----END PGP SIGNATURE----- --SR0DFWOzPg+ymaCV--