From owner-freebsd-amd64@FreeBSD.ORG Tue Mar 14 23:20:41 2006
Message-ID: <346a80220603141520i2ac1a4br66cbfb213453dcd6@mail.gmail.com>
Date: Tue, 14 Mar 2006 18:20:35 -0500
From: "Coleman Kane"
To: JoaoBR
Cc: kono@kth.se, freebsd-amd64@freebsd.org
In-Reply-To: <200603140740.38388.joao@matik.com.br>
References: <20060313221836.5491916A420@hub.freebsd.org> <200603141106.13693.kono@kth.se> <200603140740.38388.joao@matik.com.br>
Subject: Re: amd64 slower than i386 on identical AMD 64 system? / How is hyperthreading handled on amd64?
Reply-To: cokane@cokane.org
List-Id: Porting FreeBSD to the AMD64 platform

On 3/14/06, JoaoBR wrote:
>
> On Tuesday 14 March 2006 07:06, Alexander Konovalenko wrote:
> > > Hi
> > > Since some time (>6.0R) I have the impression that amd64 runs slower
> > > than i386. Now I run some tests on identical hardware and using ubench
> > > confirmes this. Somebody has comments on this?
> >
> > I have Dual core AMD64 4400+ and FreeBSD RELENG_5. I don't have FreeBSD
> > i386 installed but you can just compare benchmarks.
> >
> > ubench uses all CPU/cores by default, when one ubench is running, top
> > shows:
>
> so where is your comparism? My point was that the same hardware is faster
> running i386
>
> I experience this also on X2 machines but do not have two machines to
> compare
> I have a X2-4400-SMP running amd64 and a X2-4200-SMP running i386 and it
> gives me the same numbers running ubench
>
> João
>
> >   PID USERNAME PRI NICE  SIZE  RES STATE  C  TIME   WCPU    CPU COMMAND
> > 11528 XXXX     111    0 3572K 880K RUN    1  0:12 93.64% 42.29% ubench
> > 11529 XXXX     111    0 3572K 880K CPU0   1  0:11 97.21% 41.16% ubench
> > 11526 XXXX      -8    0 3572K 880K piperd 0  0:17 41.76% 31.98% ubench
> >
> > one ubench executed (with no -s flag = use all CPU, default):
> >
> > Unix Benchmark Utility v.0.3
> > Copyright (C) July, 1999 PhysTech, Inc.
> > Author: Sergei Viznyuk
> > http://www.phystech.com/download/ubench.html
> > FreeBSD 5.5-PRERELEASE FreeBSD 5.5-PRERELEASE #12: Sun Mar 5 17:34:07 CET
> > 2006 XXXX@XXXX:/usr/obj/usr/src/sys/DAEMON64SMP amd64
> > Ubench CPU: 238149
> > Ubench MEM: 255459
> > --------------------
> > Ubench AVG: 246804
> >
> > two ubench executed with -s flag (use single CPU only):
> >
> > Ubench Single CPU: 120184 (0.40s)
> > Ubench Single MEM: 126787 (0.39s)
> > -----------------------------------
> > Ubench Single AVG: 123485
> >
> > Ubench Single CPU: 121000 (0.41s)
> > Ubench Single MEM: 128762 (0.40s)
> > -----------------------------------
> > Ubench Single AVG: 124881
> >
> > one ubench executed with -s flag (use single CPU only):
> >
> > Ubench Single CPU: 123251 (0.40s)
> > Ubench Single MEM: 161494 (0.40s)
> > -----------------------------------
> > Ubench Single AVG: 142372
> >
> > /Alexander Konovalenko
> >
> > +46-8-5537-8142 (office)
> > +46-7-3752-2116
> > http://daemon.nanophys.kth.se/~kono
> >
> > Royal Institute of Technology (KTH)
> > Nanostructure Physics Department, Albanova
> > Roslagstullsbacken 21
> > 10691 Stockholm
> > Sweden
>
> _______________________________________________
> freebsd-amd64@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-amd64
> To unsubscribe, send any mail to "freebsd-amd64-unsubscribe@freebsd.org"
>

I think that the nature of the ubench benchmark should be investigated to
reveal the reasons behind your dismay. It seems to me that your assumption
that 64-bit should be faster than 32-bit in all cases is wrong. The nature
of the processor design, the OS implementation, and how ubench does its
measurement all need to be considered.
First of all, when comparing a 64-bit amd64 system to a 32-bit IA-32 system,
it is important to know that 64-bit *does not* mean that if you tested a
loop of:

    long x, y, z;
    x = 1;
    y = 1;
    z = x + y;

the 64-bit machine would get through twice as many of those calculations.
In fact, on the 64-bit machine the memory taken up by x, y, and z would be
double that on i386, the add/load instructions would also grow in size
(64-bit operands need an extra prefix byte), and as far as execution goes,
the time *should* be about the same on both. All of this suggests that
64-bit would, by its nature, come out somewhat slower on average than your
32-bit system. In addition, amd64 64-bit mode doubles your register set,
increasing the amount of state that needs to be saved and restored on a
context switch, and everything points towards... probably slower.

The benefit of a 64-bit architecture is the wider word length, which opens
up your options. If I want to do a 64-bit calculation on my 64-bit box, I no
longer need to fumble around with EDX:EAX register concatenations, etc.
Also, things like "BigInteger" code (which builds unlimited-size integers by
creating data strings) should benefit as well.

Anyhow, the ubench pkg-descr says that it spawns a bunch of processes to do
"senseless mathematical integer and floating-point calculations". The same
is stated for the memory operations that it executes. In fact, the CPU bench
part of it uses doubles and unsigned int types to do its work. Guess what?

    sizeof(unsigned int) == sizeof(unsigned int) on amd64 vs. i386
    sizeof(double) == sizeof(double) on amd64 vs. i386

Being 64-bit does not necessarily make the system able to do twice as many
unsigned int operations as it would in i386 mode. That's just not how it's
designed.

So... I still fail to see your point with this thread. It's not faster,
because the way that ubench does its work is not expected to be faster.
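
To make the sizeof point concrete, here is a minimal sketch in C. The helper name print_type_sizes and the BITS_OF macro are made up for this illustration and are not part of ubench; the expected sizes in the comments assume FreeBSD's usual i386 and amd64 data models.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Width of a type in bits, derived from sizeof (illustrative macro). */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Hypothetical helper: print the type sizes relevant to this thread. */
void print_type_sizes(void)
{
    /* The types the ubench CPU test is said to use.
     * Assumed to be 4 and 8 bytes on both i386 and amd64. */
    printf("unsigned int: %zu bytes\n", sizeof(unsigned int));
    printf("double:       %zu bytes\n", sizeof(double));

    /* The types that do change: 4 bytes on i386, 8 bytes on amd64. */
    printf("long:         %zu bytes\n", sizeof(long));
    printf("void *:       %zu bytes\n", sizeof(void *));
}
```

Call print_type_sizes() from a main() of your own on both installs and compare the output. If only the last two lines differ, a benchmark built on unsigned int and double has no wider operands to take advantage of.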
In fact, it is slower, because it is doing more work, and you are seeing
the results of that.

--
Coleman Kane
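
P.S. For the "BigInteger" remark above, here is a rough sketch of where the wider word can actually pay off. It adds two multi-word integers stored least-significant word first, once with 32-bit words and once with 64-bit words. The function names add_limbs32 and add_limbs64 are invented for this example and do not come from any particular library; this is an illustration of the idea, not a tuned implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two n-word integers made of 32-bit words (least significant first).
 * Returns the carry out of the top word. */
uint32_t add_limbs32(uint32_t *sum, const uint32_t *a, const uint32_t *b,
                     size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen to 64 bits so the carry lands in the upper half. */
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        sum[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}

/* Same operation with 64-bit words: the same number of bits needs only
 * half as many loop iterations. Returns the carry out of the top word. */
uint64_t add_limbs64(uint64_t *sum, const uint64_t *a, const uint64_t *b,
                     size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = a[i] + carry;
        uint64_t c1 = (t < carry);   /* wrapped when adding the carry */
        sum[i] = t + b[i];
        uint64_t c2 = (sum[i] < t);  /* wrapped when adding b[i] */
        carry = c1 | c2;
    }
    return carry;
}
```

A 128-bit addition takes four iterations of the first loop but only two of the second. That kind of code can gain from 64-bit mode, while a benchmark that only works on unsigned int and double has nothing comparable to gain.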