Date: Wed, 6 Apr 2011 13:21:04 -0400
From: Attilio Rao <attilio@freebsd.org>
To: Ryan Stone <rysto32@gmail.com>
Cc: freebsd-current@freebsd.org
Subject: Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
Message-ID: <BANLkTimDrgGN_e9V8H7O%2BbqXTpYnKPrPZg@mail.gmail.com>
In-Reply-To: <BANLkTinvisvGiQOs5w-nsxzRVbJUN5%2B5yQ@mail.gmail.com>
References: <BANLkTinSyDaY-06N95n8c1NxOSdEnb5FkQ@mail.gmail.com> <201104060836.56542.jhb@freebsd.org> <BANLkTinvisvGiQOs5w-nsxzRVbJUN5%2B5yQ@mail.gmail.com>

2011/4/6 Ryan Stone <rysto32@gmail.com>:
> On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin <jhb@freebsd.org> wrote:
>> Hummm. Patching 4BSD to use the same route as ULE may be the best solution
>> for now if that is easiest. Alternatively, you could change 4BSD's
>> sched_add() to not try to kick other CPUs until smp_started is true.
>
> At first I thought that it was a consequence of the way it does CPU
> affinity, but now I see that it shortcuts if smp_started is not true.
> How about something like the following for 4BSD?
>
> --- sched_4bsd.c (revision 220222)
> +++ sched_4bsd.c (working copy)
> @@ -1242,14 +1242,14 @@
>  	}
>  	TD_SET_RUNQ(td);
> 
> -	if (td->td_pinned != 0) {
> +	if (smp_started && td->td_pinned != 0) {
>  		cpu = td->td_lastcpu;
>  		ts->ts_runq = &runq_pcpu[cpu];
>  		single_cpu = 1;
>  		CTR3(KTR_RUNQ,
>  		    "sched_add: Put td_sched:%p(td:%p) on cpu%d runq", ts, td,
>  		    cpu);
> -	} else if (td->td_flags & TDF_BOUND) {
> +	} else if (smp_started && (td->td_flags & TDF_BOUND)) {
>  		/* Find CPU from bound runq. */
>  		KASSERT(SKE_RUNQ_PCPU(ts),
>  		    ("sched_add: bound td_sched not on cpu runq"));
> @@ -1258,7 +1258,7 @@
>  		CTR3(KTR_RUNQ,
>  		    "sched_add: Put td_sched:%p(td:%p) on cpu%d runq", ts, td,
>  		    cpu);
> -	} else if (ts->ts_flags & TSF_AFFINITY) {
> +	} else if (smp_started && (ts->ts_flags & TSF_AFFINITY)) {
>  		/* Find a valid CPU for our cpuset */
>  		cpu = sched_pickcpu(td);
>  		ts->ts_runq = &runq_pcpu[cpu];
>
> The flow control is a bit awkward because of the multiple
> affinity/bound cpu cases. If somebody prefers the code to be
> structured differently I'd be open to suggestions.
That is more or less what ULE does -- in ULE it is simpler because it
goes via sched_pickcpu(), which always returns CPU0 as long as the APs
have not started yet.
I would also consider adding a comment on top explaining the check,
but otherwise it looks fine.
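
For illustration only, here is a rough user-space model of one way the
check could be structured, with a single early test and a comment on
top instead of repeating smp_started in every branch. It is a sketch
under assumptions, not the sched_4bsd.c code: struct model_thread,
model_smp_started, model_pick_runq() and the MODEL_* constants are all
hypothetical names, and the affinity branch uses a placeholder where
the real code calls sched_pickcpu(td).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAXCPU      4     /* hypothetical CPU count for this model */
#define MODEL_GLOBAL_RUNQ (-1)  /* stands for the shared global run queue */

/* Hypothetical stand-ins for td_pinned, TDF_BOUND and TSF_AFFINITY. */
struct model_thread {
	int  pinned;    /* models td_pinned: non-zero when pinned */
	bool bound;     /* models TDF_BOUND */
	bool affinity;  /* models TSF_AFFINITY */
	int  lastcpu;   /* models td_lastcpu */
	int  boundcpu;  /* CPU whose run queue a bound thread sits on */
};

/* Models smp_started: stays false until the APs are running. */
static bool model_smp_started = false;

/*
 * Return the index of the per-CPU run queue the thread should go on,
 * or MODEL_GLOBAL_RUNQ when it should go on the global run queue.
 */
static int
model_pick_runq(const struct model_thread *td)
{
	int cpu;

	/*
	 * Until the APs have started, only the boot CPU can run threads.
	 * Ignore pinning, binding and affinity so that nothing is queued
	 * for (and no wakeup is sent to) a CPU that is not running yet.
	 */
	if (!model_smp_started)
		return (MODEL_GLOBAL_RUNQ);

	if (td->pinned != 0)
		cpu = td->lastcpu;
	else if (td->bound)
		cpu = td->boundcpu;
	else if (td->affinity)
		cpu = td->lastcpu;	/* placeholder for sched_pickcpu(td) */
	else
		return (MODEL_GLOBAL_RUNQ);

	assert(cpu >= 0 && cpu < MODEL_MAXCPU);
	return (cpu);
}
```

In this model a thread bound to CPU 2 goes on the global queue while
model_smp_started is false and on queue 2 once it is set to true, which
is the behaviour the three smp_started checks in the patch aim for.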
Attilio
--
Peace can only be achieved by understanding - A. Einstein
