From: "Murty, Ravi"
To: "Kris Kennaway"
Cc: freebsd-hackers@freebsd.org
Date: Sun, 20 Jul 2008 06:51:22 -0700
Subject: RE: Bug in calcru in he 6.2 and 6.3 kernels
In-Reply-To: <487284CA.4050407@FreeBSD.org>
References: <48726193.1080807@FreeBSD.org> <48727E37.30700@delphij.net> <487284CA.4050407@FreeBSD.org>
List-Id: Technical Discussions relating to FreeBSD

Has anyone identified what exactly is broken in the ULE scheduler in
6.2? I am running a rather simple test: it creates 8 threads and runs
them on an 8-CPU system (with not much else running on the system).
When I run it with ULE, it runs slowly, sometimes very slowly; it's
almost as if the threads aren't being picked to run. When I switch to
4BSD, things run fine. Is there something I could look at? I realize
ULE is broken in 6.x, but I've added lots of stuff to the scheduler
(for our project) which I'd have to migrate to ULE in 7.0. I'd like to
figure out what might be going on in 6.2 before I spend the time to
migrate to 7.0.

Thanks
Ravi

-----Original Message-----
From: Kris Kennaway [mailto:kris@FreeBSD.org]
Sent: Monday, July 07, 2008 2:04 PM
To: d@delphij.net
Cc: Murty, Ravi; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in he 6.2 and 6.3 kernels

Xin LI wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Kris Kennaway wrote:
> | Murty, Ravi wrote:
> |> Hello everyone,
> |>
> |> Finally found what my last problem was. We were running top in a loop
> |> and running some workloads that called sched_bind() to bind threads to
> |> specific CPUs. The problem was that (and I am using ULE) sched_bind
> |> calls a function to notify another CPU of a thread and then mi_switches
> |> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> |> and given the thread is still running, calcru would come in and assert
> |> the fact that "If I am running I better not be on NOCPU". It appears
> |> that in other parts of the kernel (e.g. forward_signal) this is
> |> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
> |>
> |> Thanks
> |> Ravi
> |
> | Don't use ULE in 6.x, it's broken and will not be fixed.
>
> Perhaps we should mark it as broken using #error? After all, the ULE
> changes in 7.x are amazing and we do not want users to get a bad
> impression from the 6.x versions...
>
> I am not sure, but some explicit warning message saying "ULE has been
> revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
> SCHED_4BSD or upgrade to 7.x." seems better than having them
> peruse the mailing list archive...

I would agree with this; if you're happy running unstable and broken
scheduler code, you're surely able to update to 7.0 and run stable and
working scheduler code :)

We should run it past re@ first since it's a change to a stable branch,
but it's experimental code so I don't see an issue.

Kris
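
For readers of the archive, the #error idea quoted above could look roughly like the sketch below. This is only an illustration, not the change that was (or was not) committed to 6.x: the SKETCH_* macro names and the helper function are made up for this example and are not real FreeBSD kernel options. The message string is the one Xin LI proposed.

```c
/*
 * Hypothetical sketch only: the SKETCH_* macros stand in for whatever
 * the kernel build would define when a scheduler is selected. They are
 * assumptions for this example, not real FreeBSD options.
 */
#define SKETCH_SCHED_4BSD 1
/* #define SKETCH_SCHED_ULE 1 */

/*
 * Refuse to build with the old ULE unless the user explicitly opts in
 * (SKETCH_ALLOW_BROKEN_ULE is likewise a made-up escape hatch).
 */
#if defined(SKETCH_SCHED_ULE) && !defined(SKETCH_ALLOW_BROKEN_ULE)
#error "ULE has been revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use SCHED_4BSD or upgrade to 7.x."
#endif

/* Report which scheduler this translation unit was built with. */
static const char *
sketch_scheduler_name(void)
{
#if defined(SKETCH_SCHED_ULE)
	return "ULE";
#else
	return "4BSD";
#endif
}
```

With SKETCH_SCHED_ULE uncommented and no opt-in macro defined, the compiler stops at the #error line and prints the message, so the user sees the advice at build time rather than having to find it in the list archive.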