Date: Sun, 20 Jul 2008 06:51:22 -0700
From: "Murty, Ravi" <ravi.murty@intel.com>
To: "Kris Kennaway" <kris@FreeBSD.org>, <d@delphij.net>
Cc: freebsd-hackers@freebsd.org
Subject: RE: Bug in calcru in he 6.2 and 6.3 kernels
Message-ID: <AEBCFC23C0E40949B10BA2C224FC61B007C221B9@orsmsx416.amr.corp.intel.com>
In-Reply-To: <487284CA.4050407@FreeBSD.org>
References: <AEBCFC23C0E40949B10BA2C224FC61B007A3253D@orsmsx416.amr.corp.intel.com> <48726193.1080807@FreeBSD.org> <48727E37.30700@delphij.net> <487284CA.4050407@FreeBSD.org>

Has anyone identified the issue(s) that might be broken in the ULE
scheduler in 6.2? I am running a rather simple test - creates 8 threads
and runs it on an 8 CPU system (not a whole lot running on the system).
When I run it with ULE, it runs slow, very slow sometimes - it's almost
like the threads aren't picked to run. When I switch to 4BSD, things run
fine.

I was wondering if there is something I could look at? I realize it is
broken, but I've added lots of stuff to the scheduler (for our project)
which I'd have to migrate to ULE in 7.0. I'd like to figure out what
might be going on in 6.2 before I spend the time to migrate to 7.0.

Thanks
Ravi

-----Original Message-----
From: Kris Kennaway [mailto:kris@FreeBSD.org]
Sent: Monday, July 07, 2008 2:04 PM
To: d@delphij.net
Cc: Murty, Ravi; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in he 6.2 and 6.3 kernels

Xin LI wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Kris Kennaway wrote:
> | Murty, Ravi wrote:
> |> Hello everyone,
> |>
> |> Finally found what my last problem was. We were running top in a loop
> |> and running some workloads that called sched_bind() to bind threads to
> |> specific CPUs. The problem was that (and I am using ULE) sched_bind
> |> calls a function to notify another CPU of a thread and then mi_switches
> |> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> |> and given the thread is still running, calcru would come in and assert
> |> the fact that "If I am running I better not be on NOCPU". It appears
> |> that in other parts of the kernel (e.g. forward_signal) this is
> |> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
> |>
> |> Thanks
> |> Ravi
> |
> | Don't use ULE in 6.x, it's broken and will not be fixed.
>
> Perhaps we should mark it as broken using #error? After all the ULE
> changes in 7.x is amazing and we do not want to have users to obtain bad
> impressions from the 6.x versions...
>
> I am not sure but some explicit warning message saying "ULE has been
> revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
> SCHED_4BSD or upgrade to 7.x." seems to be better than having them to
> pursue the mailing list archive...

I would agree with this; if you're happy running unstable and broken
scheduler code, you're surely able to update to 7.0 and run stable and
working scheduler code :)

We should run it past re@ first since it's a change to a stable branch,
but it's experimental code so I don't see an issue.

Kris
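
A compile-time guard of the kind Xin LI suggests in the quoted text could be sketched roughly as follows. This is only an illustration under stated assumptions: DEMO_SCHED_ULE is a hypothetical macro standing in for "the ULE scheduler was selected", demo_ule_selected() is a hypothetical helper, and nothing here is taken from the FreeBSD source tree. How the 6.x kernel build actually selects a scheduler is not shown in this thread.

```c
/*
 * Sketch of a compile-time guard.
 * DEMO_SCHED_ULE is a hypothetical macro, not a FreeBSD kernel option name.
 * If it is defined, compilation stops with the message proposed in the thread.
 */
#ifdef DEMO_SCHED_ULE
#error "ULE has been revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use SCHED_4BSD or upgrade to 7.x."
#endif

/*
 * Hypothetical helper: this code is only compiled when the guard above
 * did not fire, so it always reports that ULE was not selected.
 */
int
demo_ule_selected(void)
{
	return 0;
}
```

Building this file normally succeeds; building it with -DDEMO_SCHED_ULE makes the compiler stop at the #error line and print the message, which is the behavior the proposal is after.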