Date:      Sun, 20 Jul 2008 06:51:22 -0700
From:      "Murty, Ravi" <ravi.murty@intel.com>
To:        "Kris Kennaway" <kris@FreeBSD.org>, <d@delphij.net>
Cc:        freebsd-hackers@freebsd.org
Subject:   RE: Bug in calcru in he 6.2 and 6.3 kernels
Message-ID:  <AEBCFC23C0E40949B10BA2C224FC61B007C221B9@orsmsx416.amr.corp.intel.com>
In-Reply-To: <487284CA.4050407@FreeBSD.org>
References:  <AEBCFC23C0E40949B10BA2C224FC61B007A3253D@orsmsx416.amr.corp.intel.com> <48726193.1080807@FreeBSD.org> <48727E37.30700@delphij.net> <487284CA.4050407@FreeBSD.org>

Has anyone identified what exactly is broken in the ULE scheduler in
6.2? I am running a rather simple test: it creates 8 threads and runs
them on an 8-CPU system (with not a whole lot else running on the system).
When I run it with ULE, it runs slow, very slow sometimes - it's almost
like the threads aren't picked to run. When I switch to 4BSD, things run
fine. I was wondering if there is something I could look at? I realize
it is broken, but I've added lots of stuff to the scheduler (for our
project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
what might be going on in 6.2 before I spend the time to migrate to 7.0.

Thanks
Ravi


-----Original Message-----
From: Kris Kennaway [mailto:kris@FreeBSD.org]
Sent: Monday, July 07, 2008 2:04 PM
To: d@delphij.net
Cc: Murty, Ravi; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in he 6.2 and 6.3 kernels

Xin LI wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Kris Kennaway wrote:
> | Murty, Ravi wrote:
> |> Hello everyone,
> |>
> |>
> |>
> |> Finally found what my last problem was. We were running top in a loop
> |> and running some workloads that called sched_bind() to bind threads to
> |> specific CPUs. The problem was that (and I am using ULE) sched_bind
> |> calls a function to notify another CPU of a thread and then mi_switches
> |> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> |> and given the thread is still running, calcru would come in and assert
> |> the fact that "If I am running I better not be on NOCPU". It appears
> |> that in other parts of the kernel (e.g. forward_signal) this is
> |> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
> |>
> |>
> |> Thanks
> |> Ravi
> |
> | Don't use ULE in 6.x, it's broken and will not be fixed.
>
> Perhaps we should mark it as broken using #error?  After all, the ULE
> changes in 7.x are amazing and we do not want users to get a bad
> impression from the 6.x versions...
>
> I am not sure, but some explicit warning message saying "ULE has been
> revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
> SCHED_4BSD or upgrade to 7.x." seems to be better than having them
> peruse the mailing list archive...

I would agree with this; if you're happy running unstable and broken
scheduler code, you're surely able to update to 7.0 and run stable and
working scheduler code :)

We should run it past re@ first since it's a change to a stable branch,
but it's experimental code so I don't see an issue.

Kris


