Message-ID: <4B97A568.5080101@semihalf.com>
Date: Wed, 10 Mar 2010 14:58:00 +0100
From: Grzegorz Bernacki
To: Mark Tinguely
Cc: freebsd-arm@freebsd.org
Subject: Re: Performance of SheevaPlug on 8-stable
In-Reply-To: <201003072125.o27LPfFb000968@casselton.net>

Mark Tinguely wrote:
> FreeBSD-current has kernel and user witness turned on. Witness is for
> locks, so it should not change the performance of a tight arithmetic loop
> like this.
>
> I don't know the Marvell internals, and from what I can tell, their technical
> docs require an NDA. That said, many of the ARM processors also have an
> internal instruction cache (instruction prefetch) in addition to the
> instruction cache. I don't think the prefetch has an enable/disable.
>
> It looks like from the CPU identification that the branch prediction
> is turned on. Branch prediction compensates for the longer pipelines.
> I can't see how that could go astray in the tight loop.
>
> Thus says the ARM ARM:
>
>   ARM implementations are free to choose how far ahead of the
>   current point of execution they prefetch instructions; either
>   a fixed or a dynamically varying number of instructions. As well
>   as being free to choose how many instructions to prefetch, an ARM
>   implementation can choose which possible future execution path to
>   prefetch along. For example, after a branch instruction, it can
>   choose to prefetch either the instruction following the branch
>   or the instruction at the branch target. This is known as branch
>   prediction.
>
> There are a few dangling data allocations that I would like to see
> closed from the multiple kernel allocation fix. *IN THEORY, IF* a page
> is allocated via the arm_nocache (DMA COHERENT) or a sendfile, then
> it is never marked as unallocated. *IN THEORY*, if that page is used
> again, then we could falsely believe that page is being shared and
> we turn off the cache, even though it is not shared.
>
> http://www.casselton.net/~tinguely/arm_pmap_unmanaged.diff
>
> * Disclaimer: I am not sure whether DMA COHERENT or sendfile is used in
> the Sheeva implementation. This is a theoretical observation of a side
> effect of the multiple kernel mapping patch that we did just before
> FreeBSD 8-release.

I instrumented the code with KTRs and your theory is correct: the kernel
reuses a page which was previously mapped via arm_nocache. Your patch
should be applied to -current.

grzesiek