From owner-freebsd-current@FreeBSD.ORG Tue Sep 9 00:44:26 2008
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 9EE1710657A3
    for ; Tue, 9 Sep 2008 00:44:26 +0000 (UTC)
    (envelope-from peter@wemm.org)
Received: from an-out-0708.google.com (an-out-0708.google.com [209.85.132.251])
    by mx1.freebsd.org (Postfix) with ESMTP id 592E58FC1A
    for ; Tue, 9 Sep 2008 00:44:26 +0000 (UTC)
    (envelope-from peter@wemm.org)
Received: by an-out-0708.google.com with SMTP id b33so304794ana.13
    for ; Mon, 08 Sep 2008 17:44:24 -0700 (PDT)
Received: by 10.100.41.9 with SMTP id o9mr16531995ano.42.1220921064269;
    Mon, 08 Sep 2008 17:44:24 -0700 (PDT)
Received: by 10.100.154.11 with HTTP; Mon, 8 Sep 2008 17:44:24 -0700 (PDT)
Message-ID:
Date: Mon, 8 Sep 2008 17:44:24 -0700
From: "Peter Wemm"
To: "John Baldwin"
In-Reply-To: <200809081556.02732.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <200808230003.44081.jhb@freebsd.org>
    <200809021608.57542.jhb@freebsd.org>
    <200809081556.02732.jhb@freebsd.org>
Cc: Benjamin.Close@clearchain.com, attilio@freebsd.org,
    freebsd-current@freebsd.org, kib@freebsd.org, kevinxlinuz@163.com
Subject: Re: [BUG] I think sleepqueue need to be protected in sleepq_broadcast
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 09 Sep 2008 00:44:26 -0000

On Mon, Sep 8, 2008 at 12:56 PM, John Baldwin wrote:
> On Tuesday 02 September 2008 09:40:49 pm Peter Wemm wrote:
[..]
>> I don't know if it is the same problem, but mx2.freebsd.org, running
>> today's 6.4-PRERELEASE just died with:
>> Sep 3 00:20:11 mx2 sshd[15333]: fatal: Read from socket failed: Connection
>> resr panic: Assertion td->td_flags & TDF_SINTR failed at
>> ../../../kern/subr_sleepque5 cpuid = 2
>> KDB: enter: panic
>> FreeBSD 6.4-PRERELEASE #7: Tue Sep 2 19:43:27 UTC 2008
>> This was after about 3 hours of uptime. It has previously run happily
>> for months at a time before today's rebuild.
>
> So I think what happened is that the thread was woken up while the sleepq
> chain was unlocked while the thread unlocks the sx lock. The code handles
> this fine already since the same race can happen when dropping the lock while
> checking for signals. However, in this case TDF_SINTR won't be true anymore.
> The assertion just needs to be updated. Try this:
>
> Index: subr_sleepqueue.c
> ===================================================================
> --- subr_sleepqueue.c (revision 182874)
> +++ subr_sleepqueue.c (working copy)
> @@ -382,7 +382,7 @@
>  	CTR3(KTR_PROC, "sleepq catching signals: thread %p (pid %ld, %s)",
>  	    (void *)td, (long)p->p_pid, p->p_comm);
>
> -	MPASS(td->td_flags & TDF_SINTR);
> +	MPASS((td->td_sleepqueue != NULL) ^ (td->td_flags & TDF_SINTR));
>  	mtx_unlock_spin(&sc->sc_lock);
>
>  	/* See if there are any pending signals for this thread. */

This is running on mx2 right now.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
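
For anyone reading the patch above, here is a small userland sketch of which
thread states the updated assertion accepts. It is not kernel code: the
struct thread_model type, the helper names, and the TDF_SINTR value are all
stand-ins assumed for illustration. It also assumes, as I understand the
sleepqueue code, that td_sleepqueue is NULL while a thread sits on a sleep
queue, and that MPASS() fires when its expression is zero. One thing the
sketch makes visible: because the XOR mixes a 0/1 comparison with a raw flag
mask, the "both true" state still evaluates nonzero. The second helper, which
is only a possible variant and not part of the posted patch, reduces both
sides to 0/1 first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag value for illustration; the real one is in sys/proc.h. */
#define TDF_SINTR 0x00000008

/* Hypothetical cut-down stand-in for the kernel's struct thread. */
struct thread_model {
	void *td_sleepqueue;	/* assumed NULL while on a sleep queue */
	int   td_flags;
};

/* The expression from the posted patch, reduced to a truth value. */
static bool
check_as_posted(const struct thread_model *td)
{
	return (((td->td_sleepqueue != NULL) ^
	    (td->td_flags & TDF_SINTR)) != 0);
}

/* Possible variant: normalize both operands to 0/1 before comparing. */
static bool
check_normalized(const struct thread_model *td)
{
	return ((td->td_sleepqueue != NULL) !=
	    ((td->td_flags & TDF_SINTR) != 0));
}

/* Walk the four combinations and record what each check accepts. */
static void
sintr_self_check(void)
{
	int dummy;
	struct thread_model asleep  = { NULL,   TDF_SINTR }; /* still queued */
	struct thread_model awake   = { &dummy, 0 };         /* already woken */
	struct thread_model both    = { &dummy, TDF_SINTR }; /* inconsistent */
	struct thread_model neither = { NULL,   0 };         /* inconsistent */

	assert(check_as_posted(&asleep) && check_normalized(&asleep));
	assert(check_as_posted(&awake) && check_normalized(&awake));
	assert(!check_as_posted(&neither) && !check_normalized(&neither));
	/* 1 ^ 0x8 is nonzero, so only the normalized form rejects this. */
	assert(check_as_posted(&both) && !check_normalized(&both));
}
```

Calling sintr_self_check() from a test main() exercises all four states. The
two states the patch is meant to allow (still queued with TDF_SINTR set, and
already woken with it cleared) pass under both forms.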