From owner-freebsd-questions@FreeBSD.ORG Sun Mar 27 10:33:39 2005
Date: Sun, 27 Mar 2005 12:33:36 +0200
From: Anthony Atkielski
To: freebsd-questions@freebsd.org
Subject: Re: hyper threading.
Message-ID: <14510304120.20050327123336@wanadoo.fr>
In-Reply-To: <8C7007D5D4D30D2-A38-3B313@mblk-r33.sysops.aol.com>
References: <1641928994.20050326192811@wanadoo.fr>
    <8C700529A2DFD74-A44-3A157@mblk-d34.sysops.aol.com>
    <439876144.20050326220638@wanadoo.fr>
    <8C7006AE7E80573-FAC-3B652@mblk-r28.sysops.aol.com>
    <49251524.20050326234521@wanadoo.fr>
    <8C7007D5D4D30D2-A38-3B313@mblk-r33.sysops.aol.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: User questions

em1897@aol.com writes:

> You can argue the technical theory all you want, but the
> measurements say otherwise.

You have to ensure that you're doing the right measurements.

> FreeBSD 4.9 -> Load: 38% (I put this in for fun :-)
>
> FreeBSD 5.4-Pre UP (no HT) -> Load: high 55-60% range
>
> FreeBSD 5.4-Pre SMP/HT -> Load: 70-80% (much more jumping around)

You'll find that the total CPU time required from start to finish for a
single thread is ALWAYS higher for SMP than for a UP environment, even
if you have separate physical processors.

Several things happen when you move from a uniprocessor environment to
an environment with two or more processors:

- The total CPU time for each thread increases.
- The total system load on a per-process basis increases.
- The total throughput of the system improves if there is more than one
  independent process running in the system.
- Each of the processors runs more slowly than it would if it were the
  only processor running in a UP environment.

If you run a single-thread benchmark on an MP system, you'll find that
it runs more slowly than it does on a UP system. If you run multiple
independent single-thread benchmarks on an MP system, you'll find that
the total CPU time for each benchmark increases over that required in a
UP system--but the elapsed time required to complete all benchmarks
diminishes substantially.

To properly gauge the performance of a multiprocessor system, you must
run a realistic mix of tasks on the system and measure overall
throughput. If you do this, you'll find that you always come out ahead
with multiple processors, even HT processors.

Hyperthreading is just a special case of multiprocessing that imposes
some additional restrictions. HT is much more sensitive to similarities
in instruction mix across processes, because the actual processor
hardware is being shared. With a sufficiently heterogeneous instruction
mix across multiple execution threads, this isn't a problem; but if you
are running a single-threaded benchmark, or a series of identical
single-threaded benchmarks, it can seriously distort your measurements.
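
To make the difference between CPU time and elapsed time concrete, here
is a minimal Python sketch of a toy model. Everything in it is an
assumption made for illustration: the function name run_batch, the
batch of four identical 10-second tasks, and above all the 0.85 speed
factor for each processor in the two-processor case. None of these are
measurements from any real system.

```python
import math


def run_batch(num_tasks, cpu_seconds_up, num_cpus, per_cpu_speed):
    """Toy model of identical, independent single-threaded tasks.

    per_cpu_speed is each processor's speed relative to the same
    processor running alone in a UP system (1.0 means no slowdown).
    It is an assumed figure, not a measured one.
    """
    # A slower processor needs more CPU seconds for the same work.
    cpu_per_task = cpu_seconds_up / per_cpu_speed
    # Tasks run in waves, one task per processor per wave.
    waves = math.ceil(num_tasks / num_cpus)
    elapsed = waves * cpu_per_task
    total_cpu = num_tasks * cpu_per_task
    return cpu_per_task, total_cpu, elapsed


# Hypothetical batch: four tasks that each need 10 CPU seconds on UP.
up = run_batch(4, 10.0, num_cpus=1, per_cpu_speed=1.0)
mp = run_batch(4, 10.0, num_cpus=2, per_cpu_speed=0.85)

for name, (per_task, total, elapsed) in (("UP", up), ("MP", mp)):
    print(f"{name}: cpu/task={per_task:.2f}s "
          f"total cpu={total:.2f}s elapsed={elapsed:.2f}s")

print(f"throughput gain: {up[2] / mp[2]:.2f}x")
```

In this model the MP run burns more CPU time per task and in total, yet
finishes the whole batch sooner. Calling run_batch with a single task
shows the other half of the point: the lone task takes longer on the MP
configuration than on UP.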

Although adding physical processors diminishes the performance of each
processor, it still adds overall processing power, up to a certain
point. The increment is never equal to the actual number of processors
added, though; that is, if you go from one to two processors, you never
get a doubling of effective processor power--the gain is more like
70-80%. The percentage increment gets worse with each additional
processor, until you reach a point at which performance actually starts
to decline (the point at which this happens is extremely hardware
dependent, but it's always well beyond two processors).

Hyperthreaded processors should not diminish in performance just
because HT is turned on, because the hardware contention that
diminishes performance in conventional MP systems is largely absent in
an HT microprocessor. However, since you are really still only sharing
a single processor with HT, the overall increment is much lower than it
would be with two physical processors, and it is very sensitive to the
instruction mix.

> this shows that you really are a bit foggy. Did you miss the part
> where with 2 processors you actually do have 2 processors?

I actually read what Intel had to say on how the architecture works,
and I spent years measuring systems the hard way (with hardware
monitors and probes), so I know somewhat whereof I speak.
Multiprocessing was always a significant hot-button issue with
customers, as they always wanted to know how much they really gained
with multiple processors (as opposed to what they had been promised).

> I can make an argument that networking with 1 processor on 5.4 is
> better than with 2. For example, with a test similar to the above, with
> 2 physical processors FreeBSD 5.4 will start dropping packets way before
> it hits 500Kpps unless you increase the interrupts/second, which of
> course increases the system load.
> And even with the dropped packets
> (which should reduce the load because it doesn't have to receive
> and transmit the packet), the load is still higher than for 4.x with
> a single processor.

Load is not a problem, as long as it's below 100%.

Since individual processors slow down in MP configurations, anything
that depends on raw processor speed will suffer in an MP configuration.
However, overall system throughput is greatly enhanced by running with
several processors. At the same time, the total processor time required
to complete all tasks is greater in an MP environment than it would be
in a UP environment--it's the fact that things can run in parallel that
improves the throughput.

Moral: if you want to avoid dropping packets in the situation you
describe, increase the interrupt rate. The additional processing power
of the system will make this practical.

> You and many others regularly say things like "SMP is obviously faster",
> or "Opterons are noticeably faster", but those statements are only true
> for certain applications.

True, but those "certain applications" are the kind normally executed
in real-world desktop and server systems. If this were not the case,
multiprocessing systems would have been abandoned long ago.

It's almost always better to have a single processor at 2 GFLOPS than
it is to have two processors at 1 GFLOPS, but if you can't get 2 GFLOPS
processors, having two 1 GFLOPS processors is the next best thing.

-- 
Anthony