From owner-freebsd-smp  Wed May  3 18: 5:31 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP id C744037B56E
	for ; Wed, 3 May 2000 18:05:24 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.9.3/8.9.3) id SAA18594;
	Wed, 3 May 2000 18:05:03 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201) via SMTP by smtp02.primenet.com,
	id smtpdAAAyMaqrK; Wed May  3 18:04:54 2000
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id SAA01849;
	Wed, 3 May 2000 18:05:11 -0700 (MST)
From: Terry Lambert
Message-Id: <200005040105.SAA01849@usr01.primenet.com>
Subject: Re: hlt instructions and temperature issues
To: BHechinger@half.com (Brian Hechinger)
Date: Thu, 4 May 2000 01:05:11 +0000 (GMT)
Cc: tlambert@primenet.com ('Terry Lambert'), BHechinger@half.com (Brian Hechinger),
	dillon@apollo.backplane.com, jgowdy@home.com, smp@csn.net,
	jim@thehousleys.net, freebsd-smp@FreeBSD.ORG
In-Reply-To: from "Brian Hechinger" at May 02, 2000 05:24:28 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> >> so there is no super-critical need for CPU idling.
> >
> >Not unless you have power or heat dissipation issues for a
> >particular use case.  For the vast majority of users, it's
> >meaningless, unless they have philosophical instead of
> >technical reasons (e.g. they are environmentalists, etc.).
>
> so, poorly designed cases and tree huggers aside, we shouldn't see any
> problems if this doesn't work.
>
> >> but very acceptable for the gains.
> >
> >If the gains are purely thermal, perhaps not.  It does introduce
> >an additional context switch latency, when leaving the scheduler,
> >for the CPU that is running -- this means that it penalizes the
> >IPI sending CPU to wake up the receiving CPU.  But I think that
> >if this is done correctly, this will be practically unmeasurable.
>
> i tried sending mail to the list, but the list doesn't like my mail server
> all of a sudden, and i said (which applies here i believe):
>
> so if the giant spin-lock is broken down to CPU-level spin-locks, would
> that facilitate us doing this correctly?
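For illustration, a minimal version of the halt-in-idle scheme being
discussed might look like the sketch below.  This is hypothetical code,
not anything from the FreeBSD kernel: HLT is a privileged instruction,
so a loop like this only makes sense in ring 0, and every name in it is
made up.

	/*
	 * Hypothetical sketch: idle a CPU in HLT until another CPU
	 * posts work and sends a wakeup IPI.
	 */
	static volatile int work_pending;	/* set by the CPU that enqueues work */

	static void
	cpu_idle(void)
	{
		for (;;) {
			/*
			 * Test for work with interrupts disabled, so a
			 * wakeup IPI cannot slip in between the test and
			 * the HLT and be lost.
			 */
			__asm__ __volatile__("cli");
			if (work_pending) {
				__asm__ __volatile__("sti");
				return;		/* re-enter the scheduler */
			}
			/*
			 * STI enables interrupts only after the next
			 * instruction has executed, so the IPI is
			 * delivered once this CPU is already halted and
			 * wakes it immediately.  Delivering that IPI is
			 * the extra wakeup latency the sending CPU pays,
			 * as described above.
			 */
			__asm__ __volatile__("sti; hlt");
		}
	}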
> >> so those super-cryo CPU cooling units are hype. :)
> >
> >Not if they actually cool the CPU.
>
> but to what benefit?  if a simple fan can do the job of keeping the CPU
> cool enough, does over-cooling the CPU make things better, or does it not
> really affect anything (besides cooling too much, which is also bad from
> what i understand)
>
> >> so no "real" usefulness for such a beast, only overly complicated code?
> >
> >IMO, the utility is more in the ability to prepare the kernel
> >for further change in the direction of per-CPU run queues.  This
> >will require an interlock and IPI for process migration from
> >CPU #1's run queue to CPU #N's run queue.
>
> so the HLT issue may belong to something bigger and it wouldn't hurt to
> look into it.
>
> >The benefit to doing this is processor affinity for processes.
> >
> >Right now, if a CPU comes into the scheduler, the process at
> >the top of the run queue gets the CPU.  This results in cache
> >busting.  Consider that in an 8-processor system, there is only
> >a 12.5% probability of getting the same CPU you were using last
> >time, and thus there is an 87.5% probability of a cache bust.
>
> but this can be controlled because we can tell which CPU the process last
> ran on?
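A hypothetical sketch of the per-CPU run queue idea follows, with the
last-ran-CPU affinity hint and the interlock-plus-IPI migration path
described above.  None of these names are real FreeBSD kernel symbols;
the spinlock and IPI primitives are stand-ins.

	#include <stddef.h>

	#define NCPU	8

	struct proc {
		struct proc	*p_next;
		int		 p_lastcpu;	/* CPU this process last ran on */
	};

	struct runq {
		volatile int	 rq_lock;	/* stand-in for a per-CPU spinlock */
		struct proc	*rq_head;
	};

	static struct runq runqs[NCPU];

	static void
	spin_lock(volatile int *l)
	{
		while (__sync_lock_test_and_set(l, 1))
			;			/* spin */
	}

	static void
	spin_unlock(volatile int *l)
	{
		__sync_lock_release(l);
	}

	static void
	send_ipi(int cpu)
	{
		/* stand-in: kick `cpu` out of HLT with an IPI */
		(void)cpu;
	}

	/*
	 * Make a process runnable on the CPU it last ran on, so it finds
	 * its cache contents still warm; with a single shared queue on 8
	 * CPUs it would get its old CPU back only 1 time in 8 (12.5%).
	 */
	static void
	setrunqueue_affine(struct proc *p)
	{
		struct runq *rq = &runqs[p->p_lastcpu];

		spin_lock(&rq->rq_lock);
		p->p_next = rq->rq_head;
		rq->rq_head = p;
		spin_unlock(&rq->rq_lock);
		send_ipi(p->p_lastcpu);		/* in case that CPU is halted */
	}

	/*
	 * Moving a process from CPU `from`'s queue to CPU `to`'s queue
	 * takes both queue locks (the interlock), in a fixed order to
	 * avoid deadlock, then sends an IPI to the destination CPU.
	 */
	static void
	migrate(struct proc *p, int from, int to)
	{
		struct runq *first = &runqs[from < to ? from : to];
		struct runq *second = &runqs[from < to ? to : from];
		struct proc **pp;

		if (from == to)
			return;
		spin_lock(&first->rq_lock);
		spin_lock(&second->rq_lock);
		for (pp = &runqs[from].rq_head; *pp != NULL; pp = &(*pp)->p_next) {
			if (*pp == p) {
				*pp = p->p_next;		/* unlink from old queue */
				p->p_next = runqs[to].rq_head;	/* push on new queue */
				runqs[to].rq_head = p;
				p->p_lastcpu = to;
				break;
			}
		}
		spin_unlock(&second->rq_lock);
		spin_unlock(&first->rq_lock);
		send_ipi(to);
	}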
> >People who don't know any better commonly claim that the SMP
> >scaling on shared memory multiprocessor architectures is
> >limited to about 4 processors before you hit a wall of diminishing
> >returns for additional processors.  This is not true, unless you
> >are doing a lot of interprocessor communication; this will not
> >commonly happen in practice, unless you do the wrong things and
> >let it happen through bad design.  If all of your engines are
> >work-to-do engines, then they are practically identical, except
> >for cache contents, and there is little or no need to communicate
> >between them.
>
> the usual "if it's done right in the first place" mantra. :)
>
> but doesn't SUN use a shared memory MP arch?  if so, they put as many as
> 64 CPUs in a single box.  i tend to believe that SUN knows what they are
> doing, considering how long they have been doing it.  :)
>
> >For example, there's little reason that an HTTP server with 8
> >engines can not run one engine per processor, keep persistent
> >connections (HTTP 1.1 spec.), and operate almost wholly
> >independently from one another.
>
> or something like SETI@home, which never has to talk to the other clients
> (and in fact doesn't even know they exist)
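As a rough illustration of the one-engine-per-processor idea, the sketch
below forks one worker per CPU and pins each to its own processor.  It
uses the cpuset(2) interface, which did not exist at the time of this
thread (it appeared in much later FreeBSD releases), so this shows only
the shape of the design, not anything available then.

	#include <sys/param.h>
	#include <sys/cpuset.h>
	#include <err.h>
	#include <stdio.h>
	#include <unistd.h>

	int
	main(void)
	{
		int ncpu = (int)sysconf(_SC_NPROCESSORS_ONLN);

		for (int cpu = 0; cpu < ncpu; cpu++) {
			pid_t pid = fork();
			if (pid == -1)
				err(1, "fork");
			if (pid == 0) {
				cpuset_t mask;

				/* Restrict this worker to exactly one CPU. */
				CPU_ZERO(&mask);
				CPU_SET(cpu, &mask);
				if (cpuset_setaffinity(CPU_LEVEL_WHICH,
				    CPU_WHICH_PID, -1, sizeof(mask), &mask) == -1)
					err(1, "cpuset_setaffinity");
				printf("worker pinned to CPU %d\n", cpu);
				/* accept() and serve persistent connections here */
				_exit(0);
			}
		}
		/* a real server would wait for its workers here */
		return (0);
	}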
> so again i see us falling back to if it's written right in the first
> place.....
>
> cheers,
>
> -brian
> ps: thanks for trying to get me to understand this.  i appreciate it.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message