Date: Wed, 10 Mar 2010 14:58:00 +0100
From: Grzegorz Bernacki <gjb@semihalf.com>
To: Mark Tinguely <tinguely@casselton.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: Performance of SheevaPlug on 8-stable
Message-ID: <4B97A568.5080101@semihalf.com>
In-Reply-To: <201003072125.o27LPfFb000968@casselton.net>
References: <201003072125.o27LPfFb000968@casselton.net>

Mark Tinguely wrote:
> FreeBSD-current has kernel and user witness turned on. Witness is for
> locks, so it should not change the performance of a tight arithmetic loop
> like this.
>
> I don't know the Marvell internals, and from what I can tell, their technical
> docs require an NDA. That said, many of the ARM processors also have an
> internal instruction cache (instruction prefetch) in addition to the
> instruction cache. I don't think the prefetch has an enable/disable.
>
> It looks from the CPU identification like branch prediction
> is turned on. Branch prediction compensates for the longer pipelines.
> I can't see how that could go astray in the tight loop.
>
> Thus says the ARM ARM:
>
>    ARM implementations are free to choose how far ahead of the
>    current point of execution they prefetch instructions; either
>    a fixed or a dynamically varying number of instructions. As well
>    as being free to choose how many instructions to prefetch, an ARM
>    implementation can choose which possible future execution path to
>    prefetch along. For example, after a branch instruction, it can
>    choose to prefetch either the instruction following the branch
>    or the instruction at the branch target. This is known as branch
>    prediction.
>
> There are a few dangling data allocations that I would like to see
> closed from the multiple kernel allocation fix. *IN THEORY, IF* a page
> is allocated via arm_nocache (DMA COHERENT) or a sendfile, then
> it is never marked as unallocated. *IN THEORY*, if that page is used
> again, then we could falsely believe that the page is being shared and
> turn off the cache, even though it is not shared.
>
> http://www.casselton.net/~tinguely/arm_pmap_unmanaged.diff
>
> * Disclaimer: I am not sure whether DMA COHERENT or sendfile is used in
>   the Sheeva implementation. This is a theoretical observation of a side
>   effect of the multiple kernel mapping patch that we did just before
>   FreeBSD 8-release.

I instrumented the code with KTRs and your theory is correct: the kernel
reuses a page which was previously mapped via arm_nocache. Your patch
should be applied to -current.

grzesiek
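
For readers following the thread, the reuse problem confirmed above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not the FreeBSD pmap code and not the contents of the patch, and every name in it (toy_page, toy_map, toy_unmap_buggy, toy_unmap_fixed, toy_reuse_is_cacheable) is hypothetical. It assumes the bookkeeping simply counts recorded mappings per page and disables caching whenever more than one mapping appears to exist.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of per-page bookkeeping; NOT the real pmap. */
struct toy_page {
    int  mappings;   /* mappings the bookkeeping believes exist */
    bool cacheable;  /* whether the page may be mapped cached */
};

/* Record a new mapping. A second recorded mapping looks like
 * sharing, so caching is switched off for the page. */
static void toy_map(struct toy_page *pg)
{
    pg->mappings++;
    pg->cacheable = (pg->mappings <= 1);
}

/* Buggy release: models a free path that never marks the page as
 * unallocated, so the stale mapping record is left behind. */
static void toy_unmap_buggy(struct toy_page *pg)
{
    (void)pg;
}

/* Fixed release: drops the record so later reuse starts clean. */
static void toy_unmap_fixed(struct toy_page *pg)
{
    if (pg->mappings > 0)
        pg->mappings--;
    pg->cacheable = (pg->mappings <= 1);
}

/* Allocate, free, and reuse one page; report whether the reused
 * page is still considered cacheable. */
static bool toy_reuse_is_cacheable(bool use_fixed_release)
{
    struct toy_page pg = { 0, true };

    toy_map(&pg);                 /* first use of the page */
    if (use_fixed_release)
        toy_unmap_fixed(&pg);
    else
        toy_unmap_buggy(&pg);
    toy_map(&pg);                 /* page is reused later */
    return pg.cacheable;
}
```

In this model, toy_reuse_is_cacheable(false) reports the reused page as uncacheable because the stale record makes it look shared, while toy_reuse_is_cacheable(true) keeps it cacheable. That is the same shape of effect as described in the quoted message: an unshared page losing its cache because an earlier mapping was never cleared.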