From owner-freebsd-amd64@FreeBSD.ORG  Thu Mar 15 11:55:47 2012
Sender: Maho NAKATA <maho.nakata@gmail.com>
Date: Thu, 15 Mar 2012 20:55:37 +0900 (JST)
Message-Id: <20120315.205537.1682271453232733525.chat95@mac.com>
To: tomdean@speakeasy.org
From: Maho NAKATA <chat95@mac.com>
In-Reply-To: <4F4DDCE7.9000008@speakeasy.org>
References: <4F4DA398.6070703@speakeasy.org>
	<20120229161408.G2514@besplex.bde.org>
	<4F4DDCE7.9000008@speakeasy.org>
X-Mailer: Mew version 6.3 on Emacs 23.4 / Mule 6.0 (HANACHIRUSATO)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-amd64@freebsd.org
Subject: Re: Gcc46 and 128 Bit Floating Point

Hi Thomas D. Dean,

Why not use the double-double approach?
Double-double is a poor man's quad math.
Using an NVIDIA C2050, we can obtain 16 GFlops to 26 GFlops performance
for matrix-matrix multiplication.
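
To make the idea concrete, here is a minimal sketch of double-double
arithmetic in C. It is not code from any particular library; the names
dd_t, two_sum, two_prod, dd_add and dd_mul are hypothetical and chosen
only for illustration. A value is stored as an unevaluated sum hi + lo
of two doubles, which gives roughly 106 bits of significand (IEEE quad
has 113) while keeping the exponent range of double. The sketch assumes
strict IEEE double evaluation (SSE2 on amd64, no -ffast-math) and uses
the simple "sloppy" addition, which is less accurate than the fully
compensated variant.

```c
/* A double-double value: the unevaluated sum hi + lo. */
typedef struct { double hi, lo; } dd_t;

/* Knuth's two-sum: returns s and e such that s + e == a + b exactly. */
static dd_t two_sum(double a, double b)
{
    double s  = a + b;
    double bb = s - a;
    double e  = (a - (s - bb)) + (b - bb);
    return (dd_t){ s, e };
}

/* Veltkamp split: a == hi + lo, each half fits in 26 bits. */
static dd_t split(double a)
{
    double c  = 134217729.0 * a;   /* 2^27 + 1 */
    double hi = c - (c - a);
    return (dd_t){ hi, a - hi };
}

/* Dekker's two-product: p + e == a * b exactly (barring over/underflow).
 * Where a hardware FMA is available, e could instead be fma(a, b, -p). */
static dd_t two_prod(double a, double b)
{
    double p = a * b;
    dd_t sa = split(a), sb = split(b);
    double e = ((sa.hi * sb.hi - p) + sa.hi * sb.lo + sa.lo * sb.hi)
               + sa.lo * sb.lo;
    return (dd_t){ p, e };
}

/* Sloppy double-double addition: add the high parts exactly,
 * fold in the low parts, then renormalize. */
static dd_t dd_add(dd_t x, dd_t y)
{
    dd_t s = two_sum(x.hi, y.hi);
    return two_sum(s.hi, s.lo + x.lo + y.lo);
}

/* Double-double multiplication: exact product of the high parts
 * plus the first-order cross terms; the lo*lo term is dropped. */
static dd_t dd_mul(dd_t x, dd_t y)
{
    dd_t p = two_prod(x.hi, y.hi);
    return two_sum(p.hi, p.lo + (x.hi * y.lo + x.lo * y.hi));
}
```

For example, adding 2^-80 to 1.0 with dd_add keeps the small term in
the lo word instead of rounding it away, which plain double addition
cannot do. Every operation costs a handful of ordinary double
operations, which is why this maps well onto hardware FPUs and GPUs.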

I have been developing a linear algebra library:
http://mplapack.sourceforge.net/

Thanks
 Nakata Maho

From: "Thomas D. Dean" <tomdean@speakeasy.org>
Subject: Re: Gcc46 and 128 Bit Floating Point
Date: Wed, 29 Feb 2012 00:08:07 -0800

> On 02/28/12 22:03, Bruce Evans wrote:
> 
>>
>> But why would you want it? It is essentially unusable on sparc64,
>> since it is several thousand times slower than 80-bit floating point
>> on i386. At equal CPU clock speeds, it is only about 1000 times
>> slower.
>> Most of the factors of 10 are due to fundamental slowness of multi-
>> word arithmetic in software and the soft-float implementations not
>> being very good (I only tested with the old NetBSD/4.4BSD-derived one.
>> This has been replaced by the Hauser one, which has good chances for
>> being worse due to its greater generality and correctness, but the old
>> one has a lot of slop to improve). A modern x86 is much faster than
>> an old sparc64, giving about another factor of 10. 64-bit operations
>> are only about this 10 times slower (or more like 3 times slower at
>> equal CPU clock speeds) on an old sparc64 as on a not-so-modern core2
>> x86. The gnu libraries might be better. So you could hope for only
>> a factor of 100 slowdown on scalar code. But modern x86's can also
>> do vector code, and thus be up to 8 times faster for 32-bit floating
>> point with AVX. Really good multi-word libraries might be able to
>> exploit some vector operations, but I think multi-word operations are
>> too serial in nature to get much parallelism with them.
> 
> I have an application that takes 10 days to run on a 4.16GHz Core-i7
> 3930K.  No output until it finishes.
> 
> When I first started looking at this, I naively thought the 80-bit FPU
> floats were scaled to 128-bits.  Would be nice...
> 
> The application uses libgmp, but, about 1/2 to 2/3 of the work will
> fit in a 128-bit float.
> 
> I wanted to get 128-bit floating point operations so I could do 2/3
> the work in an FPU.  With 80-bits, I can only do 1/3 the work(+-).
> 
> Mostly, this is just "can I do it faster...".  Maybe some asm code to
> work the inner loops in FPU registers.  At some point, hand off to
> libgmp.  I now think the speed improvement would not be worth the
> work.
> 
> Tom Dean