From: Mark Tinguely
Date: Tue, 9 Sep 2008 11:14:10 -0500 (CDT)
Message-Id: <200809091614.m89GEAc4088266@casselton.net>
To: jacques.fourie@gmail.com, sam@freebsd.org
Cc: freebsd-arm@freebsd.org
Subject: Re: Routing benchmarks
List-Id: Porting FreeBSD to the StrongARM Processor

(trimmed the thread text)

> >> Running netserver on the gumstix shows a throughput of 2.4Mbit/s. At
> >> the moment I can't get if_bridge to work - will try to figure out what
> >> is going on. A bridging benchmark may be more informative.
> > My experience in working with architectures like this is that cache handling
> > can be a significant cost that doesn't always show up on a profile.
> >
>
> Thanks for the nice idea - will try something similar. At the moment
> I'm also suspecting that cache handling has got a lot to do with the
> performance figures that I'm seeing. The PXA255 has a 32KB data and
> 32KB instruction cache.
>
> Jacques

Which version of FreeBSD are you using? We changed some cache flushing
routines between FreeBSD 7.x and current. Unless errors were introduced or
removed, there should not be that large a change.

As mentioned, the ARM caches are pretty small. ARM processors before
architecture version 6 (anything before ARM11) use virtually indexed /
virtually tagged (VIVT) caches, so they need to be flushed on context
changes. The version 6 and version 7 ARM processors (ARM11 and later) are
either virtually indexed / physically tagged (VIPT) or physically indexed /
physically tagged (PIPT). The PIPT caches don't need to be flushed on
context changes and were needed for multiple processor support. The pmap
code will have to be rewritten to take advantage of the PIPT caches (put a
process value into the MMU and remove most flushes).

Also, the pre-version 6 ARM processors didn't allow for any spare bits in
the PTE for OS use. The newer processors have a bit or two, still not
enough for FreeBSD's needs, so we need to shadow these bits.

Thirdly, we dynamically allocate a separate structure that mirrors the page
table. I think I have all the "paper scratching" required to move from this
structure to the FreeBSD i386/amd64 recursive page table approach.

--Mark Tinguely.
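
For readers who have not met the recursive page table approach mentioned
above, here is a minimal sketch of the address arithmetic it relies on. It
assumes an i386-style layout (32-bit virtual addresses, 4 KB pages, 4-byte
entries, two levels of 1024 entries each). The slot number and the function
names are hypothetical, chosen only for illustration; this is not FreeBSD's
actual pmap code, and it is not the ARM layout.

```c
#include <stdint.h>

/* Assumed i386-style layout: 32-bit VA, 4 KB pages, 4-byte entries. */
#define PAGE_SHIFT      12u    /* log2 of the page size */
#define PDR_SHIFT       22u    /* log2 of the range covered by one directory entry */
#define ENTRY_SIZE      4u     /* bytes per page table / directory entry */
#define RECURSIVE_SLOT  1023u  /* hypothetical directory slot that points back at the directory */

/*
 * Because one directory entry points at the directory itself, every
 * page table appears in a single 4 MB virtual window starting here.
 */
static uint32_t ptmap_base(void)
{
    return (uint32_t)(RECURSIVE_SLOT << PDR_SHIFT);
}

/* Virtual address at which the PTE that maps 'va' can be read. */
static uint32_t vtopte_addr(uint32_t va)
{
    return ptmap_base() + (va >> PAGE_SHIFT) * ENTRY_SIZE;
}

/*
 * Virtual address of the directory entry that covers 'va'. The
 * directory is the page of the window selected by the recursive slot.
 */
static uint32_t vtopde_addr(uint32_t va)
{
    uint32_t dir_base = ptmap_base() + (RECURSIVE_SLOT << PAGE_SHIFT);

    return dir_base + (va >> PDR_SHIFT) * ENTRY_SIZE;
}
```

With these assumed constants, vtopte_addr(0x00401000) is 0xFFC01004 and
vtopde_addr(0x00401000) is 0xFFFFF004. No separate structure mirroring the
page table is needed to find an entry, since the lookup is plain arithmetic
on the virtual address. The classic ARM tables are shaped differently (a
4096-entry first level and 256-entry coarse second-level tables), so the
constants above would not carry over unchanged.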