From owner-freebsd-smp  Wed May  3 18: 5:31 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP id C744037B56E
	for ; Wed, 3 May 2000 18:05:24 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.9.3/8.9.3) id SAA18594;
	Wed, 3 May 2000 18:05:03 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201) via SMTP by smtp02.primenet.com,
	id smtpdAAAyMaqrK; Wed May  3 18:04:54 2000
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id SAA01849;
	Wed, 3 May 2000 18:05:11 -0700 (MST)
From: Terry Lambert
Message-Id: <200005040105.SAA01849@usr01.primenet.com>
Subject: Re: hlt instructions and temperature issues
To: BHechinger@half.com (Brian Hechinger)
Date: Thu, 4 May 2000 01:05:11 +0000 (GMT)
Cc: tlambert@primenet.com ('Terry Lambert'), BHechinger@half.com (Brian Hechinger),
	dillon@apollo.backplane.com, jgowdy@home.com, smp@csn.net,
	jim@thehousleys.net, freebsd-smp@FreeBSD.ORG
In-Reply-To: from "Brian Hechinger" at May 02, 2000 05:24:28 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> >> so there is no super-critical need for CPU idling.
> >
> >Not unless you have power or heat dissipation issues for a
> >particular use case.  For the vast majority of users, it's
> >meaningless, unless they have philosophical instead of
> >technical reasons (e.g. they are environmentalists, etc.).
>
> so, poorly designed cases and tree huggers aside, we shouldn't see any
> problems if this doesn't work.
>
> >> but very acceptable for the gains.
> >
> >If the gains are purely thermal, perhaps not.  It does introduce
> >an additional context switch latency, when leaving the scheduler,
> >for the CPU that is running -- this means that it penalizes the
> >IPI sending CPU to wake up the receiving CPU.  But I think that
> >if this is done correctly, this will be practically unmeasurable.
>
> i tried sending mail to the list, but the list doesn't like my mail server
> all of a sudden, and i said (which applies here i believe):
>
> so if the giant spin-lock is broken down to CPU-level spin-locks, would
> that facilitate us doing this correctly?
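For illustration, a minimal version of the halt-in-idle scheme being
discussed might look like the sketch below.  This is hypothetical code,
not anything from the FreeBSD kernel: HLT is a privileged instruction,
so a loop like this only makes sense in ring 0, and every name in it is
made up.

	/*
	 * Hypothetical sketch: idle a CPU in HLT until another CPU
	 * posts work and sends a wakeup IPI.
	 */
	static volatile int work_pending;	/* set by the CPU that enqueues work */

	static void
	cpu_idle(void)
	{
		for (;;) {
			/*
			 * Test for work with interrupts disabled, so a
			 * wakeup IPI cannot slip in between the test and
			 * the HLT and be lost.
			 */
			__asm__ __volatile__("cli");
			if (work_pending) {
				__asm__ __volatile__("sti");
				return;		/* re-enter the scheduler */
			}
			/*
			 * STI enables interrupts only after the next
			 * instruction has executed, so the IPI is
			 * delivered once this CPU is already halted and
			 * wakes it immediately.  Delivering that IPI is
			 * the extra wakeup latency the sending CPU pays,
			 * as described above.
			 */
			__asm__ __volatile__("sti; hlt");
		}
	}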
> >> so those super-cryo CPU cooling units are hype. :)
> >
> >Not if they actually cool the CPU.
>
> but to what benefit?  if a simple fan can do the job of keeping the CPU
> cool enough, does over-cooling the CPU make things better, or does it not
> really affect anything (besides cooling too much, which is also bad from
> what i understand)
>
> >> so no "real" usefulness for such a beast, only overly complicated code?
> >
> >IMO, the utility is more in the ability to prepare the kernel
> >for further change in the direction of per-CPU run queues.  This
> >will require an interlock and IPI for process migration from
> >CPU #1's run queue to CPU #N's run queue.
>
> so the HLT issue may belong to something bigger and it wouldn't hurt to
> look into it.
>
> >The benefit to doing this is processor affinity for processes.
> >
> >Right now, if a CPU comes into the scheduler, the process at
> >the top of the run queue gets the CPU.  This results in cache
> >busting.  Consider that in an 8-processor system, there is only
> >a 12.5% probability of getting the same CPU you were using last
> >time, and thus there is an 87.5% probability of a cache bust.
>
> but this can be controlled because we can tell which CPU the process last
> ran on?
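A hypothetical sketch of the per-CPU run queue idea follows, with the
last-ran-CPU affinity hint and the interlock-plus-IPI migration path
described above.  None of these names are real FreeBSD kernel symbols;
the spinlock and IPI primitives are stand-ins.

	#include <stddef.h>

	#define NCPU	8

	struct proc {
		struct proc	*p_next;
		int		 p_lastcpu;	/* CPU this process last ran on */
	};

	struct runq {
		volatile int	 rq_lock;	/* stand-in for a per-CPU spinlock */
		struct proc	*rq_head;
	};

	static struct runq runqs[NCPU];

	static void
	spin_lock(volatile int *l)
	{
		while (__sync_lock_test_and_set(l, 1))
			;			/* spin */
	}

	static void
	spin_unlock(volatile int *l)
	{
		__sync_lock_release(l);
	}

	static void
	send_ipi(int cpu)
	{
		/* stand-in: kick `cpu` out of HLT with an IPI */
		(void)cpu;
	}

	/*
	 * Make a process runnable on the CPU it last ran on, so it finds
	 * its cache contents still warm; with a single shared queue on 8
	 * CPUs it would get its old CPU back only 1 time in 8 (12.5%).
	 */
	static void
	setrunqueue_affine(struct proc *p)
	{
		struct runq *rq = &runqs[p->p_lastcpu];

		spin_lock(&rq->rq_lock);
		p->p_next = rq->rq_head;
		rq->rq_head = p;
		spin_unlock(&rq->rq_lock);
		send_ipi(p->p_lastcpu);		/* in case that CPU is halted */
	}

	/*
	 * Moving a process from CPU `from`'s queue to CPU `to`'s queue
	 * takes both queue locks (the interlock), in a fixed order to
	 * avoid deadlock, then sends an IPI to the destination CPU.
	 */
	static void
	migrate(struct proc *p, int from, int to)
	{
		struct runq *first = &runqs[from < to ? from : to];
		struct runq *second = &runqs[from < to ? to : from];
		struct proc **pp;

		if (from == to)
			return;
		spin_lock(&first->rq_lock);
		spin_lock(&second->rq_lock);
		for (pp = &runqs[from].rq_head; *pp != NULL; pp = &(*pp)->p_next) {
			if (*pp == p) {
				*pp = p->p_next;		/* unlink from old queue */
				p->p_next = runqs[to].rq_head;	/* push on new queue */
				runqs[to].rq_head = p;
				p->p_lastcpu = to;
				break;
			}
		}
		spin_unlock(&second->rq_lock);
		spin_unlock(&first->rq_lock);
		send_ipi(to);
	}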
> >People who don't know any better commonly claim that the SMP
> >scaling on shared memory multiprocessor architectures is
> >limited to about 4 processors before you hit a wall of diminishing
> >returns for additional processors.  This is not true, unless you
> >are doing a lot of interprocessor communication; this will not
> >commonly happen in practice, unless you do the wrong things and
> >let it happen through bad design.  If all of your engines are
> >work-to-do engines, then they are practically identical, except
> >for cache contents, and there is little or no need to communicate
> >between them.
>
> the usual "if it's done right in the first place" mantra. :)
>
> but doesn't SUN use a shared memory MP arch?  if so, they put as many as
> 64 CPUs in a single box.  i tend to believe that SUN knows what they are
> doing, considering how long they have been doing it.  :)
>
> >For example, there's little reason that an HTTP server with 8
> >engines can not run one engine per processor, keep persistent
> >connections (HTTP 1.1 spec.), and operate almost wholly
> >independently from one another.
>
> or something like SETI@home, which never has to talk to the other clients
> (and in fact doesn't even know they exist)
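As a rough illustration of the one-engine-per-processor idea, the sketch
below forks one worker per CPU and pins each to its own processor.  It
uses the cpuset(2) interface, which did not exist at the time of this
thread (it appeared in much later FreeBSD releases), so this shows only
the shape of the design, not anything available then.

	#include <sys/param.h>
	#include <sys/cpuset.h>
	#include <err.h>
	#include <stdio.h>
	#include <unistd.h>

	int
	main(void)
	{
		int ncpu = (int)sysconf(_SC_NPROCESSORS_ONLN);

		for (int cpu = 0; cpu < ncpu; cpu++) {
			pid_t pid = fork();
			if (pid == -1)
				err(1, "fork");
			if (pid == 0) {
				cpuset_t mask;

				/* Restrict this worker to exactly one CPU. */
				CPU_ZERO(&mask);
				CPU_SET(cpu, &mask);
				if (cpuset_setaffinity(CPU_LEVEL_WHICH,
				    CPU_WHICH_PID, -1, sizeof(mask), &mask) == -1)
					err(1, "cpuset_setaffinity");
				printf("worker pinned to CPU %d\n", cpu);
				/* accept() and serve persistent connections here */
				_exit(0);
			}
		}
		/* a real server would wait for its workers here */
		return (0);
	}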
> so again i see us falling back to if it's written right in the first
> place.....
>
> cheers,
>
> -brian
> ps: thanks for trying to get me to understand this.  i appreciate it.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message