Date: Mon, 29 Jan 2007 21:19:38 +0100
From: Pawel Jakub Dawidek
To: arch@FreeBSD.org
Cc: cvs-src@FreeBSD.org, Scott Long, src-committers@FreeBSD.org, cvs-all@FreeBSD.org, Nate Lawson
Message-ID: <20070129201938.GF87767@garage.freebsd.pl>
In-Reply-To: <20070129194158.N32458@fledge.watson.org>
Subject: Re: cvs commit: src/sys/geom/eli g_eli.c

On Mon, Jan 29, 2007 at 07:52:20PM +0000, Robert Watson wrote:
>
> On Mon, 29 Jan 2007, Pawel Jakub Dawidek wrote:
>
> >>Why? You're proposing yet another intrusive change to the kernel to
> >>handle yet another one-off requirement of your code. Why not do what
> >>I suggested before with hooking the appropriate SYSINIT in your
> >>module? Or why not follow Robert's suggestion and implement a simple
> >>event mechanism so that any module can know when a CPU has come
> >>online or offline. Heck, you probably don't even need to implement a
> >>new mechanism, just hook the existing EVENTHANDLER mechanism. That's
> >>what it's designed for!!
> >
> >I'm afraid Scott that your proposals are hacks. As a GEOM class I
> >should not use SYSINIT, EVENTHANDLER, etc. I shouldn't bother if CPUs
> >are online or not. All events I need to implement a GEOM class I
> >should receive from the infrastructure. Also I shouldn't be called by
> >the infrastructure when the system is not yet ready for my activity,
> >that's why I proposed to implement this functionality in the
> >infrastructure (i.e. delay the GEOM tasting mechanism), rather than
> >hack SYSINITs in every single GEOM class that needs to bind to a CPU.
>
> I guess I'm not sure I entirely agree. I think that we lack some
> important infrastructure, which we've been talking about on and off
> for a dev summit or two now, for handling the arrival and departure of
> CPU resources ("dynamic reconfiguration").
> While once this wasn't really an issue on PC hardware, it now is, with
> the advent of hypervisors, virtualization, not to mention more
> multiprocessing, etc. We have quite a few algorithms and data
> structures that assume that the set of CPUs is static, and fail quite
> badly (i.e., memory leaks, work lost, etc) if a CPU were to stop
> scheduling threads. Geli is not alone in wanting to know what and when
> CPUs are available for concurrent work, and like other pieces of code
> (UMA is the piece I have particular familiarity with), finds our
> infrastructure lacking. I'm also not entirely convinced I agree with
> you as by the same token that you might claim sysinits and event
> handlers shouldn't be used by GEOM modules, perhaps kthreads should
> also not be used :-). Sysinits, eventhandlers, and kthreads are all
> ways for scheduling and dispatching work.

The infrastructure is also there to help: to simplify the code and to
avoid code duplication. I see no reason to start GEOM classes when the
system is simply not ready. So instead of using yet another KPI in every
GEOM class that would like to bind to a CPU, I suggested removing the
code from geli and instructing GEOM to do it for all classes in one go.

> So perhaps we need to start having the conversation about CPU events
> more seriously now. What do you think of the idea of the following: two
> event handlers, a CPU start event and a CPU stop event, which are
> guaranteed to run on each CPU as the CPU comes online, and just before
> the CPU goes offline. Kernel subsystems could use these events to
> determine when CPU resources were arriving and departing in some
> serious sense (not just "busy") in order to initialize and tear down
> per-CPU data structures, rebalance workloads, start or stop per-CPU
> work, etc. The example I have in mind here is the network stack, which
> might reasonably wish to have per-CPU netisr (worker) threads.
> When the set of CPUs changes, it would like to increase or decrease
> the number of workers -- having the same number of workers compressed
> down to a smaller number of CPUs by migration would be a disaster for
> performance.

I fully agree that there should be a clean KPI for this. What you
proposed is fine. Because such a KPI is lacking, geli also has to handle
HTT CPUs, which are turned off by default in releases, by abusing
scheduler internals. The KPI you proposed would allow me to remove those
hacks. And I'm really all for it.

What you and Scott are missing is that when I implement a GEOM class,
I'm using what is available to do my work. I'm not going to educate
myself on how schedulers work and implement a nice and clean KPI just to
use it in my class. I'm not saying it wouldn't be great to be able to do
so, but I don't have time for everything, unfortunately, and you guys
should understand that very well.

I had a conversation with John (jhb@) on IRC when I asked him how I can
skip CPUs that are turned off. He then mentioned that it should be
handled by the KPI you're proposing, but also mentioned that I should go
with the solution I have now, because at this point there is nothing
better than that.

Anyway, I'd love to remove the current hacks and use what you proposed,
Robert.

> How to handle the boot processor is an interesting question -- are we
> interested in configuring away the boot processor at run-time? If not,
> we probably want to handle it as a special case via sysinit. If all
> CPUs are equal and any may go away, then we might need to rework our
> notion of shutdown, and provide these same events for the boot CPU
> (which does sound desirable so as not to end up with lots of special
> casing in subsystems). Regardless, we are hardly the first OS to try to
> address these issues via a clean architectural solution, and my
> thinking is we should do a bit of research. A first place to look
> would definitely be OpenSolaris.

I'd prefer the boot CPU not to be treated in any special way. If it has
to be, that can be hidden from the subsystems, i.e. by sending a
CPU-online event for the boot CPU but never sending a CPU-offline event,
or something like this.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
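
For illustration only, the CPU start/stop event idea discussed in this
message could be sketched roughly as below. This is a userspace toy, not
kernel code and not an existing FreeBSD KPI: every name in it
(cpu_hook_register, cpu_set_online, cpu_set_offline, worker_pool, and so
on) is hypothetical. It assumes CPU 0 plays the role of the boot CPU and
follows the suggestion above that subscribers see an online event for it
but never an offline event. The subscriber keeps one worker per online
CPU, in the spirit of the per-CPU netisr example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits for this sketch; not taken from any real kernel. */
#define SKETCH_MAXCPU   8
#define SKETCH_MAXHOOKS 4

enum cpu_event { CPU_EV_ONLINE, CPU_EV_OFFLINE };
typedef void (*cpu_hook_fn)(int cpu, enum cpu_event ev, void *arg);

struct cpu_hook { cpu_hook_fn fn; void *arg; };

static struct cpu_hook hooks[SKETCH_MAXHOOKS];
static size_t nhooks;
static int cpu_is_online[SKETCH_MAXCPU];

/* Subscribe to CPU events; returns 0 on success, -1 if the table is full. */
int
cpu_hook_register(cpu_hook_fn fn, void *arg)
{
	if (nhooks == SKETCH_MAXHOOKS)
		return (-1);
	hooks[nhooks].fn = fn;
	hooks[nhooks].arg = arg;
	nhooks++;
	return (0);
}

/* Deliver one event to every subscriber, in registration order. */
void
cpu_notify(int cpu, enum cpu_event ev)
{
	assert(cpu >= 0 && cpu < SKETCH_MAXCPU);
	for (size_t i = 0; i < nhooks; i++)
		hooks[i].fn(cpu, ev, hooks[i].arg);
}

/* "Infrastructure" side: mark a CPU online, then tell subscribers. */
int
cpu_set_online(int cpu)
{
	if (cpu < 0 || cpu >= SKETCH_MAXCPU || cpu_is_online[cpu])
		return (-1);
	cpu_is_online[cpu] = 1;
	cpu_notify(cpu, CPU_EV_ONLINE);
	return (0);
}

/*
 * Tell subscribers just before the CPU goes away. CPU 0 (the assumed
 * boot CPU) is refused, so no subscriber ever sees it go offline.
 */
int
cpu_set_offline(int cpu)
{
	if (cpu <= 0 || cpu >= SKETCH_MAXCPU || !cpu_is_online[cpu])
		return (-1);
	cpu_notify(cpu, CPU_EV_OFFLINE);
	cpu_is_online[cpu] = 0;
	return (0);
}

/* Subscriber side: track one (pretend) worker per online CPU. */
struct worker_pool {
	int has_worker[SKETCH_MAXCPU];
	int nworkers;
};

void
pool_cpu_event(int cpu, enum cpu_event ev, void *arg)
{
	struct worker_pool *wp = arg;

	if (ev == CPU_EV_ONLINE && !wp->has_worker[cpu]) {
		wp->has_worker[cpu] = 1;
		wp->nworkers++;
	} else if (ev == CPU_EV_OFFLINE && wp->has_worker[cpu]) {
		wp->has_worker[cpu] = 0;
		wp->nworkers--;
	}
}

/* Small scenario: returns the number of workers left at the end. */
int
sketch_demo(void)
{
	static struct worker_pool pool;	/* static: outlives this call */

	if (cpu_hook_register(pool_cpu_event, &pool) != 0)
		return (-1);
	cpu_set_online(0);
	cpu_set_online(1);
	cpu_set_online(2);
	cpu_set_offline(1);
	cpu_set_offline(0);	/* refused: boot CPU never goes offline */
	return (pool.nworkers);
}
```

Calling sketch_demo() from a main() of your own runs the scenario once.
The point of the design is that the subscriber never inspects scheduler
state or asks which CPUs exist; it only reacts to events. A real kernel
version would additionally need locking and a guarantee about which CPU
the handler runs on, both of which this sketch leaves out.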