Date:      Sun, 7 Mar 2010 08:00:10 +0100
From:      Bernd Walter <ticso@cicely7.cicely.de>
To:        Maks Verver <maksverver@geocities.com>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Performance of SheevaPlug on 8-stable
Message-ID:  <20100307070010.GO58319@cicely7.cicely.de>
In-Reply-To: <4B9303E4.3090500@geocities.com>
References:  <4B92BD9D.6030709@geocities.com> <20100306211715.GK58319@cicely7.cicely.de> <20100306215153.GL58319@cicely7.cicely.de> <20100306.152603.716362616846278503.imp@bsdimp.com> <4B9303E4.3090500@geocities.com>

On Sun, Mar 07, 2010 at 02:39:48AM +0100, Maks Verver wrote:
> On 03/06/2010 10:17 PM, Bernd Walter wrote:
> > Such massive speed difference sounds a bit like cache problems.
> 
> On 03/06/2010 11:26 PM, M. Warner Losh wrote:
> > Sounds a lot like ICACHE isn't being enabled, since a 3-liner like 
> > this should be executing entirely out of cache after the first 
> > instruction in main prefetches the cache line.
> 
> Thanks for the quick responses! I think the both of you are right. I
> didn't realize the cache could be turned off at all, but the boot output
> shows:
> 
>   CPU: Feroceon 88FR131 rev 1 (write-through core)
>     WB enabled EABT branch prediction enabled
>     16KB/32B 4-way Instruction cache
>     16KB/32B 4-way write-back-locking-C Data cache
> 
> This is different from the output on the wiki (which instructions I
> followed, to some extent) at http://wiki.freebsd.org/FreeBSDMarvell:
> 
> CPU: ARM926EJ-S rev 0 (ARM9EJ-S core)
>   DC enabled IC enabled WB enabled EABT branch prediction enabled
>   32KB/32B 1-way Instruction cache
>   32KB/32B 4-way write-back-locking-C Data cache
> 
> Note that this guy is not running a SheevaPlug; the CPU is different.

That's probably just because the CPUs are different.
I see similar output on all of my systems with an ARM920T CPU, and
still there is something wrong.

I just verified with my 7.0-current system:
[102]arm9# ./test 
200.000u 3.000s 12:51.47 26.3%  45+1512k 0+0io 0pf+0w
That system is in production use and isn't completely idle, but the time
is still lower, so it is hard to say whether it has a problem as well.

Most interesting is an 8.0-current system I have:
[4]beaver.cicely.de> ./test 
196.000u 1.000s 3:43.03 88.8%   44+1452k 0+0io 0pf+0w
That is still much slower than the calculated 80 seconds, but also much
faster than on my 9-current system.

> But it's clear enough that on my system both processor caches are
> disabled (even though they are correctly identified) and this is
> understandably catastrophic for performance. It's good to have that
> figured out at least. :-)

Your loop doesn't do any data access, so it only tells us something
about the ICACHE not working.
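
To illustrate the point about data access, here is a minimal sketch of a
loop that stays in registers. It is a hypothetical stand-in, not the test
program from this thread (its source is not shown here); the name spin()
and the iteration count are assumptions. The empty asm statement is a
GCC/Clang extension and only keeps the counter in a register when the
file is compiled with optimization.

```c
/* Hypothetical stand-in for the test loop discussed in this thread. */
unsigned long spin(unsigned long iterations)
{
    unsigned long i = 0;

    while (i < iterations) {
        i++;
        /*
         * Empty asm (GCC/Clang extension): tells the compiler that i is
         * read and modified here, so the loop cannot be folded away.
         * With optimization on, i lives in a register and the loop body
         * issues no loads or stores, only instruction fetches.
         */
        __asm__ volatile("" : "+r"(i));
    }
    return i;
}
```

Calling spin() from main() with a large count and running the binary
under time(1) gives a figure that depends almost entirely on instruction
fetch speed, so it says nothing about the data cache.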

> The logical next question is: why aren't these caches enabled? How is
> this supposed to work? Is the bootloader supposed to enable the cache,
> or the kernel? If the kernel, why isn't it doing this? (If it's the
> bootloader's task, then it's strange that the Linux kernel has no
> trouble enabling the cache with the same bootloader).

That's a good question.
The kernel reports them as enabled on my CPU, but is that
really true?
Is there something which disables them later, or is this code already
wrong?
But maybe it is not the ICACHE itself, and the memory pages are just
declared uncacheable?
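
As a rough sketch of what "enabled" means here: on ARM9-class cores the
CP15 control register (c1) has an enable bit for each cache, bit 12 for
the instruction cache and bit 2 for the data cache, next to bit 0 for
the MMU and bit 3 for the write buffer. The macro and function names
below are made up for illustration and are not the kernel's own.
Reading the register needs privileged mode, so the sketch only decodes
a value that was obtained elsewhere.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative names; bit positions are those of the ARM9 CP15 c1 register. */
#define CTRL_MMU    (1u << 0)   /* M: MMU enable */
#define CTRL_DCACHE (1u << 2)   /* C: data cache enable */
#define CTRL_WBUF   (1u << 3)   /* W: write buffer enable */
#define CTRL_ICACHE (1u << 12)  /* I: instruction cache enable */

int icache_enabled(uint32_t ctrl) { return (ctrl & CTRL_ICACHE) != 0; }
int dcache_enabled(uint32_t ctrl) { return (ctrl & CTRL_DCACHE) != 0; }

/* Print the enabled features of a control register value on one line. */
void print_cache_state(uint32_t ctrl)
{
    printf("%s%s%s%s\n",
           (ctrl & CTRL_MMU)    ? "MMU enabled " : "",
           (ctrl & CTRL_DCACHE) ? "DC enabled "  : "",
           (ctrl & CTRL_ICACHE) ? "IC enabled "  : "",
           (ctrl & CTRL_WBUF)   ? "WB enabled "  : "");
}
```

Even with both bits set, code can still run uncached if its pages are
mapped as uncacheable in the page tables, so this check alone cannot
settle the question.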

-- 
B.Walter <bernd@bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and much more.


