From: Andrew Turner
To: John-Mark Gurney
Cc: freebsd-arm@freebsd.org
Date: Tue, 7 Oct 2014 09:44:31 +0100
Subject: Re: [RFC] Add and armv7hf TARGET_ARCH
Message-ID: <20141007094431.09600b56@bender.lan>
In-Reply-To: <20141007042430.GH1852@funkthat.com>

On Mon, 6 Oct 2014 21:24:30 -0700
John-Mark Gurney wrote:

> Andrew Turner wrote this message on Mon, Oct 06, 2014 at 22:41 +0100:
> > On Mon, 6 Oct 2014 10:30:45 -0700
> > John-Mark Gurney wrote:
> >
> > > Andrew Turner wrote this message on Mon, Oct 06, 2014 at 13:46
> > > +0100:
> > > > I'm interested in people's opinions on creating a new TARGET_ARCH
> > > > to target ARMv7 SoCs. This will target all the current Cortex-A
> > > > chips we support but not the Raspberry Pi.
> > > > My intention with this is to have it become the tier 1 arm
> > > > platform.
> > > >
> > > > This platform will support 32-bit Cortex-A based SoCs with a VFP
> > > > unit. As it would be targeting ARMv7 we could look at supporting
> > > > Thumb-2.
> > > >
> > > > As the VFP unit is optional, any future SoCs without it will
> > > > only be supported by the armv6 TARGET_ARCH; however, I would
> > > > expect almost all ARMv7 designs to include it.
> > >
> > > So, what are the specific pros of having a new arch? I see you
> > > talk about Thumb-2 support, but are there other advantages? Will
> > > we get significant performance boosts? What?
> >
> > We would get a significant speed improvement for anything that uses
> > floating-point. I haven't done extensive tests, but Ian was getting
> > around 30x-34x improvement by using the vfp on one benchmark [1].
> > I've seen a slight improvement of around 3-5 MFlops on his numbers
> > on my board.
> >
> > I expect there to be a slight performance improvement from being
> > able to use the newer ARMv7 instructions, however this will be less
> > pronounced than the above floating-point improvement.
> >
> > There are also a number of NEON optimised libc functions we could
> > make use of, for example [2]. While we may be able to use them on
> > armv6 it becomes simpler if we can assume we have a NEON unit.
>
> Don't we already have armv6hf for hardware float? What is the
> difference between armv6hf and armv7hf? Or is this 30x-34x
> improvement over armv6hf?

My plan is to replace armv6hf with armv7hf. The difference between the
two is that on armv7hf we can assume newer floating-point instructions,
including the NEON SIMD instructions.

The performance improvement above came from changing from the soft to
the softfp ABI. Softfp allows the compiler to generate VFP code, but
will pass floating-point data between functions in the integer
registers.
It has been reported that on some cores moving data between the VFP and
ARM registers can cause both to stall for at least 20 cycles [1]. While
armv7hf doesn't completely remove the need to move data between the two
groups of registers, it will reduce it because of the calling
convention.

Andrew

[1] http://pandorawiki.org/Floating_Point_Optimization
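
The calling-convention difference between softfp and hard float can be
sketched with a small C file. This is only an illustration: the function
names scale and sum_scaled are made up and do not come from FreeBSD or
from this thread. The register assignments in the comments follow the
ARM procedure call standard (base variant for softfp, VFP variant for
hard) and assume the compiler does not inline the call.

```c
/*
 * GCC/Clang ARM float ABI options referred to in the comments below:
 *   -mfloat-abi=soft    no VFP instructions; float maths is done by
 *                       library helper routines
 *   -mfloat-abi=softfp  VFP instructions allowed, but float arguments
 *                       and results still travel in integer registers
 *   -mfloat-abi=hard    VFP instructions allowed, and float arguments
 *                       and results travel in VFP registers
 */

/*
 * Hypothetical leaf function.
 *   softfp: x arrives in r0 and k in r1; both must be moved into VFP
 *           registers before the multiply, and the product is moved
 *           back to r0 to be returned.
 *   hard:   x arrives in s0 and k in s1; the product is returned in
 *           s0, so no moves between register files are needed here.
 */
float scale(float x, float k)
{
    return x * k;
}

/*
 * Hypothetical caller. With softfp every call to scale() costs a move
 * out of the VFP registers for the arguments and a move back in for
 * the result before it can be added to acc.
 */
float sum_scaled(const float *v, int n, float k)
{
    float acc = 0.0f;

    for (int i = 0; i < n; i++)
        acc += scale(v[i], k);
    return acc;
}
```

Compiling this for an ARM target with a VFP unit, once with
-mfloat-abi=softfp and once with -mfloat-abi=hard, and comparing the
assembly (-S) around the call to scale() should show the extra moves
between the two register files in the softfp build.
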