From owner-freebsd-current@FreeBSD.ORG Thu Jul  8 17:17:06 2004
From: John Baldwin <jhb@FreeBSD.org>
To: freebsd-current@FreeBSD.org
Cc: Taku YAMAMOTO
Date: Thu, 8 Jul 2004 13:17:53 -0400
Subject: Re: Native preemption is the culprit [was Re: today's CURRENT lockups]
Message-Id: <200407081317.53981.jhb@FreeBSD.org>
In-Reply-To: <20040708222143.0f24c076.taku@tackymt.homeip.net>
References: <20040705184940.GA2651@tybalt.greiner.local>
	<20040708222143.0f24c076.taku@tackymt.homeip.net>
List-Id: Discussions about the use of FreeBSD-current

On Thursday 08 July 2004 09:21 am, Taku YAMAMOTO wrote:
> greetings,
>
> A quick glance showed me that there are some interesting code paths in
> sched_ule.c that can be problematic
> in the SMP case.
>
> 1. sched_choose() => kseq_idled() => sched_add()
> 2. sched_choose() => kseq_assign() => sched_add()
> 3. sched_runnable() => kseq_assign() => sched_add()
>
> Here is the patch that re-enables preemption except for the above three
> cases.

This looks correct.  I'll test it locally first.  Has it worked for you
all day?

> --- sched_ule.c.orig	Tue Jul  6 14:57:29 2004
> +++ sched_ule.c	Thu Jul  8 06:37:30 2004
> @@ -286,6 +286,7 @@
>  static void sched_balance_groups(void);
>  static void sched_balance_group(struct kseq_group *ksg);
>  static void sched_balance_pair(struct kseq *high, struct kseq *low);
> +static void sched_add_internal(struct thread *td, int preemptive);
>  static void kseq_move(struct kseq *from, int cpu);
>  static int kseq_idled(struct kseq *kseq);
>  static void kseq_notify(struct kse *ke, int cpu);
> @@ -616,7 +617,7 @@
>  			kseq_runq_rem(steal, ke);
>  			kseq_load_rem(steal, ke);
>  			ke->ke_cpu = PCPU_GET(cpuid);
> -			sched_add(ke->ke_thread);
> +			sched_add_internal(ke->ke_thread, 0);
>  			return (0);
>  		}
>  	}
> @@ -644,7 +645,7 @@
>  	for (; ke != NULL; ke = nke) {
>  		nke = ke->ke_assign;
>  		ke->ke_flags &= ~KEF_ASSIGNED;
> -		sched_add(ke->ke_thread);
> +		sched_add_internal(ke->ke_thread, 0);
>  	}
>  }
>
> @@ -1542,6 +1543,14 @@
>  void
>  sched_add(struct thread *td)
>  {
> +#ifdef SMP
> +	sched_add_internal(td, 1);
> +}
> +
> +static void
> +sched_add_internal(struct thread *td, int preemptive)
> +{
> +#endif /* SMP */
>  	struct kseq *kseq;
>  	struct ksegrp *kg;
>  	struct kse *ke;
> @@ -1623,17 +1632,21 @@
>  	if (td->td_priority < curthread->td_priority)
>  		curthread->td_flags |= TDF_NEEDRESCHED;
>
> -#if 0
>  #ifdef SMP
>  	/*
>  	 * Only try to preempt if the thread is unpinned or pinned to the
>  	 * current CPU.
> +	 * XXX - avoid preemption if called from sched_ule.c internally.
> +	 * There are a few code paths that may be problematic:
> +	 *   sched_choose() => kseq_idled() => sched_add()
> +	 *   sched_choose() => kseq_assign() => sched_add()
> +	 *   sched_runnable() => kseq_assign() => sched_add()
> +	 */
> -	if (KSE_CAN_MIGRATE(ke, class) || ke->ke_cpu == PCPU_GET(cpuid))
> +	if (preemptive &&
> +	    (KSE_CAN_MIGRATE(ke, class) || ke->ke_cpu == PCPU_GET(cpuid)))
>  #endif
>  	if (maybe_preempt(td))
>  		return;
> -#endif
>  	ke->ke_ksegrp->kg_runq_kses++;
>  	ke->ke_state = KES_ONRUNQ;
>
> This patch was tested on a P4@2.8GHz HTT-enabled machine.
>
> It has been running for several hours without a hang, although I have to
> admit that this machine is too idle, far from being stressed.
>
> On Tue, 6 Jul 2004 00:14:21 -0400 (EDT) Robert Watson wrote:
> > (This time to more people)
> >
> > The patch below appears to (brute force) eliminate the crash/hang I'm
> > experiencing with SCHED_ULE in the post-preemption universe.  However,
> > I was experiencing it only in the SMP case, not UP, so it could be I'm
> > just not triggering it timing-wise.  This would be a temporary fix
> > until jhb is online again post-USENIX to take a look, assuming this
> > works around the problem for people other than me.
> >
> > Note that this is probably damaging to interrupt processing latency.
> >
> > Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> > robert@fledge.watson.org      Principal Research Scientist, McAfee Research
> >
> > (snip)
> >
> > _______________________________________________
> > freebsd-current@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org