From owner-freebsd-current@FreeBSD.ORG Thu Jul 8 13:21:48 2004
Date: Thu, 8 Jul 2004 22:21:43 +0900
From: Taku YAMAMOTO
To: freebsd-current@freebsd.org
Message-Id: <20040708222143.0f24c076.taku@tackymt.homeip.net>
References: <20040705184940.GA2651@tybalt.greiner.local>
Subject: Re: Native preemption is the culprit [was Re: today's CURRENT lockups]

Greetings,

A quick glance showed me that there are some interesting code paths in
sched_ule.c that can be problematic in the SMP case:

  1. sched_choose() => kseq_idled() => sched_add()
  2. sched_choose() => kseq_assign() => sched_add()
  3. sched_runnable() => kseq_assign() => sched_add()

Here is a patch that re-enables preemption except in the above three cases.

--- sched_ule.c.orig	Tue Jul  6 14:57:29 2004
+++ sched_ule.c	Thu Jul  8 06:37:30 2004
@@ -286,6 +286,7 @@
 static void sched_balance_groups(void);
 static void sched_balance_group(struct kseq_group *ksg);
 static void sched_balance_pair(struct kseq *high, struct kseq *low);
+static void sched_add_internal(struct thread *td, int preemptive);
 static void kseq_move(struct kseq *from, int cpu);
 static int kseq_idled(struct kseq *kseq);
 static void kseq_notify(struct kse *ke, int cpu);
@@ -616,7 +617,7 @@
 			kseq_runq_rem(steal, ke);
 			kseq_load_rem(steal, ke);
 			ke->ke_cpu = PCPU_GET(cpuid);
-			sched_add(ke->ke_thread);
+			sched_add_internal(ke->ke_thread, 0);
 			return (0);
 		}
 	}
@@ -644,7 +645,7 @@
 	for (; ke != NULL; ke = nke) {
 		nke = ke->ke_assign;
 		ke->ke_flags &= ~KEF_ASSIGNED;
-		sched_add(ke->ke_thread);
+		sched_add_internal(ke->ke_thread, 0);
 	}
 }

@@ -1542,6 +1543,14 @@
 void
 sched_add(struct thread *td)
 {
+#ifdef SMP
+	sched_add_internal(td, 1);
+}
+
+static void
+sched_add_internal(struct thread *td, int preemptive)
+{
+#endif /* SMP */
 	struct kseq *kseq;
 	struct ksegrp *kg;
 	struct kse *ke;
@@ -1623,17 +1632,21 @@
 	if (td->td_priority < curthread->td_priority)
 		curthread->td_flags |= TDF_NEEDRESCHED;

-#if 0
 #ifdef SMP
 	/*
 	 * Only try to preempt if the thread is unpinned or pinned to the
 	 * current CPU.
+	 * XXX - avoid preemption if called from sched_ule.c internally.
+	 * there are a few code paths that may be problematic:
+	 *   sched_choose() => kseq_idled() => sched_add
+	 *   sched_choose() => kseq_assign() => sched_add
+	 *   sched_runnable() => kseq_assign() => sched_add
 	 */
-	if (KSE_CAN_MIGRATE(ke, class) || ke->ke_cpu == PCPU_GET(cpuid))
+	if (preemptive &&
+	    (KSE_CAN_MIGRATE(ke, class) || ke->ke_cpu == PCPU_GET(cpuid)))
 #endif
 	if (maybe_preempt(td))
 		return;
-#endif
 	ke->ke_ksegrp->kg_runq_kses++;
 	ke->ke_state = KES_ONRUNQ;

This patch has been tested on an HTT-enabled P4@2.8GHz machine.
It has been running for several hours without a hang, although I have to
admit that the machine has been mostly idle, far from being stressed.

On Tue, 6 Jul 2004 00:14:21 -0400 (EDT)
Robert Watson wrote:

>
> (This time to more people)
>
> The patch below appears to (brute force) eliminate the crash/hang I'm
> experiencing with SCHED_ULE in the post-preemption universe.  However, I
> was experiencing it only in the SMP case, not UP, so it could be I'm just
> not triggering it timing-wise.  This would be a temporary fix until jhb is
> online again post-USENIX to take a look, assuming this works around the
> problem for people other than me.
>
> Note that this is probably damaging to interrupt processing latency.
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Principal Research Scientist, McAfee Research
(snip)

-- 
-|-__    YAMAMOTO, Taku
 | __ <

Post Scriptum to the people who know me as taku@cent.saitama-u.ac.jp:
My email address has changed since April, because I've left the university.
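
For anyone who wants to see the shape of the change without reading the
whole diff, here is a minimal userland sketch of the wrapper pattern the
patch uses. It is not kernel code. The names toy_thread, toy_cpu,
toy_sched_add(), toy_sched_add_internal() and toy_assign() are hypothetical
and exist only for this illustration, and the real sched_add() does far more
(CPU selection, pinning checks, run queue accounting). The sketch only
assumes that a lower priority value means a more important thread, and that
a preempted thread goes back on the run queue.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RUNQ_CAP 8

/* Hypothetical stand-in for a kernel thread. */
struct toy_thread {
	int priority;		/* lower value = more important */
};

/* Hypothetical stand-in for one CPU's scheduler state. */
struct toy_cpu {
	struct toy_thread *curthread;		/* thread running now */
	struct toy_thread *runq[TOY_RUNQ_CAP];	/* runnable, not running */
	size_t runq_len;
};

/*
 * Common body shared by both entry points.
 * Returns 1 if td preempted the current thread, 0 if td was only queued.
 */
static int
toy_sched_add_internal(struct toy_cpu *cpu, struct toy_thread *td,
    int preemptive)
{
	if (preemptive && cpu->curthread != NULL &&
	    td->priority < cpu->curthread->priority) {
		/* Preempt: the old current thread goes back on the queue. */
		assert(cpu->runq_len < TOY_RUNQ_CAP);
		cpu->runq[cpu->runq_len++] = cpu->curthread;
		cpu->curthread = td;
		return (1);
	}
	/* No preemption: just queue the thread. */
	assert(cpu->runq_len < TOY_RUNQ_CAP);
	cpu->runq[cpu->runq_len++] = td;
	return (0);
}

/* External entry point: preemption is allowed. */
static int
toy_sched_add(struct toy_cpu *cpu, struct toy_thread *td)
{
	return (toy_sched_add_internal(cpu, td, 1));
}

/*
 * Stand-in for a scheduler-internal caller such as kseq_assign():
 * it queues the thread but never preempts.
 */
static int
toy_assign(struct toy_cpu *cpu, struct toy_thread *td)
{
	return (toy_sched_add_internal(cpu, td, 0));
}
```

With a low-priority thread running, toy_assign() leaves it running and only
queues the newcomer, while toy_sched_add() with a more important thread
switches curthread and requeues the old one. Passing the flag as an argument
keeps a single function body, which is also why the patch can fold both
functions into one when SMP is not defined.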