Date: Thu, 01 Mar 2001 20:44:05 -0800 (PST)
From: John Baldwin
To: Jake Burkholder
Subject: Re: Scheduler panic
Cc: current@FreeBSD.org, Kris Kennaway
In-Reply-To: <20010302023548.4F733BACC@cr66388-a.rchrd1.on.wave.home.com>

On 02-Mar-01 Jake Burkholder wrote:
>> > On Sun, Feb 25, 2001 at 10:29:42PM -0800, Kris Kennaway wrote:
>> > > This is on a UP system.
>> >
>> > Had another one of these, under the same conditions.  Both times I was
>> > running more(1) on a stdin stream which was generated by a "find |
>> > grep | more" operation, and I suspended the process with ^Z,
>> > triggering the panic.  Perhaps this will help in tracking down the
>> > root cause.
>>
>> I'm pretty sure I know what this is; I'll work up a patch tonight.
>>
>
> Sorry this is taking so long.  Its turned out to be a little more
> complex to fix properly than I originally thought.  We're going to
> have to change the way one of the fields of struct proc (p_pptr)
> is locked.  The problem is that a process is getting preempted
> when its not SRUN, which should be protected by the scheduler
> lock so that the preemption can't occur.
>
> This is the best workaround I can think of:
>
> Index: kern/kern_intr.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/kern/kern_intr.c,v
> retrieving revision 1.47
> diff -u -r1.47 kern_intr.c
> --- kern/kern_intr.c	2001/02/28 02:53:43	1.47
> +++ kern/kern_intr.c	2001/03/02 02:28:08
> @@ -366,7 +366,7 @@
>  		 */
>  		ithread->it_need = 1;
>  		mtx_lock_spin(&sched_lock);
> -		if (p->p_stat == SWAIT) {
> +		if (p->p_stat == SWAIT && curproc->p_stat == SRUN) {
>  			CTR1(KTR_INTR, __func__ ": setrunqueue %d", p->p_pid);
>  			p->p_stat = SRUN;
>  			setrunqueue(p);
>
> Jake

Eek, this is wrong.  We need to always put it on the runqueue, the trick is we
just need to avoid the actual task switch.  This is what I have here:

@@ -369,7 +374,7 @@
 			CTR1(KTR_INTR, __func__ ": setrunqueue %d", p->p_pid);
 			p->p_stat = SRUN;
 			setrunqueue(p);
-			if (do_switch) {
+			if (do_switch && curproc->p_stat == SRUN) {
 				saveintr = sched_lock.mtx_saveintr;
 				mtx_intr_enable(&sched_lock);
 				if (curproc != PCPU_GET(idleproc))

(Among other fixes.)

I'll try and get this committed tonight if no one screams bloody murder.

-- 

John Baldwin -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
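
For anyone following along, here is a rough userland sketch of how the two hunks above differ. It is not kernel code and not what will be committed: struct proc is cut down to three fields, the on_runq flag and the sched_quoted()/sched_reply() helpers are made up for illustration, and setrunqueue() here only sets a flag. Each helper returns whether a task switch would be attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's process states (illustrative only). */
enum pstat { SRUN, SWAIT, SSTOP };

/* Hypothetical, cut-down struct proc; on_runq is invented for this sketch. */
struct proc {
	int		p_pid;
	enum pstat	p_stat;
	bool		on_runq;
};

/* Stub: the real setrunqueue() does far more than set a flag. */
static void
setrunqueue(struct proc *p)
{
	p->on_runq = true;
}

/*
 * Quoted workaround: only wake the interrupt thread when the current
 * process is SRUN.  If it is not, the thread never reaches the runqueue.
 */
static bool
sched_quoted(struct proc *ithd, const struct proc *cur, bool do_switch)
{
	assert(ithd != NULL && cur != NULL);
	if (ithd->p_stat == SWAIT && cur->p_stat == SRUN) {
		ithd->p_stat = SRUN;
		setrunqueue(ithd);
		return (do_switch);
	}
	return (false);
}

/*
 * Reply's variant: always put the waiting thread on the runqueue, and
 * only skip the immediate switch when the current process is not SRUN.
 */
static bool
sched_reply(struct proc *ithd, const struct proc *cur, bool do_switch)
{
	assert(ithd != NULL && cur != NULL);
	if (ithd->p_stat == SWAIT) {
		ithd->p_stat = SRUN;
		setrunqueue(ithd);
		if (do_switch && cur->p_stat == SRUN)
			return (true);
	}
	return (false);
}
```

With a current process that is not SRUN (say, one in the middle of being stopped by ^Z), sched_quoted() leaves the interrupt thread in SWAIT and off the runqueue, while sched_reply() queues it and just declines to switch right away. When the current process is SRUN the two behave the same.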