Date:      Tue, 9 Sep 2008 11:14:10 -0500 (CDT)
From:      Mark Tinguely <tinguely@casselton.net>
To:        jacques.fourie@gmail.com, sam@freebsd.org
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Routing benchmarks
Message-ID:  <200809091614.m89GEAc4088266@casselton.net>
In-Reply-To: <be2f52430809090816v57c2c80u6a48446b1e875361@mail.gmail.com>


(trimmed the thread text)

>  >> Running netserver on the gumstix shows a throughput of 2.4Mbit/s. At
>  >> the moment I can't get if_bridge to work - will try to figure out what
>  >> is going on. A bridging benchmark may be more informative.

>  > My experience in working with architectures like this is that cache handling
>  > can be a significant cost that doesn't always show up on a profile.
>  >
>
>  Thanks for the nice idea - will try something similar. At the moment
>  I'm also suspecting that cache handling has got a lot to do with the
>  performance figures that I'm seeing. The PXA255 has a 32KB data and
>  32KB instruction cache.
>
>  Jacques

Which version of FreeBSD are you using? We changed some cache flushing
routines between FreeBSD 7.x and current. Unless errors were introduced
or removed along the way, there should not be that large a change in
performance between the two.

As mentioned, the ARM caches are pretty small. The ARM processors before
architecture version 6 (anything before the ARM11) use virtually indexed /
virtually tagged (VIVT) caches, so they need to be flushed on context switches.

The version 6 and version 7 ARM processors (ARM11 and later) are either
virtually indexed / physically tagged (VIPT) or physically indexed / physically
tagged (PIPT). The PIPT caches don't need to be flushed on context switches and
were needed for multiple processor support. The pmap code will have to be
re-written to take advantage of the PIPT caches (put a per-process value, an
address space ID, into the MMU and remove most flushes).
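
To make the flushing argument concrete, here is a toy model in C. It is only
a sketch under loud assumptions: a direct-mapped cache with 32-byte lines
(real ARM caches are set associative), and every name in it (toy_cache,
stale_hit_after_switch, the addresses) is made up for illustration and is
not taken from the FreeBSD pmap code. Two processes map the same virtual
address to different physical pages; the question is whether the second
process hits the first one's cache line.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SHIFT 5u     /* 32-byte cache lines (assumed) */
#define NLINES     1024u  /* 1024 lines * 32 bytes = 32KB */

struct cache_line {
	bool     valid;
	uint32_t tag;         /* virtual tag (VIVT) or physical tag (PIPT) */
};

struct toy_cache {
	bool              physically_tagged;
	struct cache_line line[NLINES];
};

/* Look up one access. Returns true on a hit; on a miss the line is filled. */
static bool
toy_cache_access(struct toy_cache *c, uint32_t va, uint32_t pa)
{
	/* VIVT indexes and tags with the VA, PIPT with the PA. */
	uint32_t addr = c->physically_tagged ? pa : va;
	uint32_t tag = addr >> LINE_SHIFT;
	struct cache_line *l = &c->line[tag % NLINES];

	if (l->valid && l->tag == tag)
		return (true);
	l->valid = true;
	l->tag = tag;
	return (false);
}

/* Invalidate every line, as a VIVT cache needs on a context switch. */
static void
toy_cache_flush(struct toy_cache *c)
{
	memset(c->line, 0, sizeof(c->line));
}

/*
 * Process A touches a VA, then process B touches the same VA, which is
 * backed by a different physical page. A hit for B would return A's data.
 */
static bool
stale_hit_after_switch(bool physically_tagged, bool flush_on_switch)
{
	struct toy_cache c;
	const uint32_t va = 0x00008000u;    /* same VA in both processes */
	const uint32_t pa_a = 0x00100000u;  /* process A's physical page */
	const uint32_t pa_b = 0x00200000u;  /* process B's physical page */

	memset(&c, 0, sizeof(c));
	c.physically_tagged = physically_tagged;

	(void)toy_cache_access(&c, va, pa_a);   /* A fills the line */
	if (flush_on_switch)
		toy_cache_flush(&c);                /* context switch */
	return (toy_cache_access(&c, va, pa_b));
}
```

In this model the virtually tagged cache gives process B a stale hit unless
it is flushed at the switch, while the physically tagged cache misses
without any flush because the two physical tags differ.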

Also, the pre-version 6 ARM processors didn't leave any spare bits in the
PTE for OS use. The newer processors have a bit or two, which is still not
enough for FreeBSD's needs, so we need to shadow these bits.
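
One way to picture the shadowing is a software array kept next to the
hardware table, one entry per PTE. The sketch below is hypothetical: the
structure, the flag names and the helpers are invented for illustration and
are not the actual FreeBSD ARM pmap definitions. The only hardware fact it
relies on is that an ARM second-level (coarse) table has 256 four-byte
entries.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t hw_pte_t;          /* what the MMU walks; no spare bits */

/* Software-only state the OS wants per mapping (hypothetical names). */
#define SHADOW_REFERENCED 0x01u
#define SHADOW_MODIFIED   0x02u
#define SHADOW_WIRED      0x04u

#define NPTE 256u                   /* entries in an ARM coarse L2 table */

struct shadowed_l2_table {
	hw_pte_t hw[NPTE];              /* hardware entries, never hold OS bits */
	uint8_t  sw[NPTE];              /* shadow bits, same index as hw[] */
};

/* Set shadow bits for one entry without touching the hardware PTE. */
static void
shadow_set(struct shadowed_l2_table *t, unsigned idx, uint8_t bits)
{
	t->sw[idx % NPTE] |= bits;
}

/* Clear shadow bits for one entry. */
static void
shadow_clear(struct shadowed_l2_table *t, unsigned idx, uint8_t bits)
{
	t->sw[idx % NPTE] &= (uint8_t)~bits;
}

/* True if all of the given shadow bits are set for the entry. */
static bool
shadow_test(const struct shadowed_l2_table *t, unsigned idx, uint8_t bits)
{
	return ((t->sw[idx % NPTE] & bits) == bits);
}
```

The cost of this layout is the extra memory and the extra lookup on every
operation that needs the OS bits, which is part of why spare bits in the PTE
itself would be preferable.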

Thirdly, we dynamically allocate a separate structure that mirrors the
page table. I think I have all the "paper scratching" required to move from
this structure to the FreeBSD i386/amd64 recursive page table approach.
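
For readers who have not seen the recursive approach, the arithmetic on
non-PAE i386 looks roughly like the sketch below: one page directory slot
points back at the page directory, which makes every PTE visible in a single
4MB window of virtual addresses, so no separate mirror structure is needed
to find one. The function names and the slot value used in the example are
assumptions for illustration, not the constants FreeBSD uses, and the ARM
table geometry differs (4096 first-level and 256 second-level entries), so
this shows only the i386 scheme.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u   /* 4KB pages */
#define PDR_SHIFT  22u   /* each page directory entry covers 4MB */
#define PTE_SIZE   4u    /* bytes per entry on non-PAE i386 */

/*
 * Virtual address at which the PTE mapping 'va' is visible, given that
 * page directory slot 'slot' points back at the page directory itself.
 */
static uint32_t
recursive_pte_va(uint32_t slot, uint32_t va)
{
	return ((slot << PDR_SHIFT) + (va >> PAGE_SHIFT) * PTE_SIZE);
}

/*
 * Virtual address of the page directory entry covering 'va'. The page
 * directory shows up inside the window as the "page table" for the
 * recursive slot.
 */
static uint32_t
recursive_pde_va(uint32_t slot, uint32_t va)
{
	return ((slot << PDR_SHIFT) + (slot << PAGE_SHIFT) +
	    (va >> PDR_SHIFT) * PTE_SIZE);
}
```

With an assumed slot of 0x300 the window starts at 0xC0000000, so the PTE
for virtual address 0x00401000 is found at 0xC0001004 and its page directory
entry at 0xC0300004.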

--Mark Tinguely.


