Date:      Tue, 31 Jan 2012 13:49:22 -0700
From:      Ian Lepore <freebsd@damnhippie.dyndns.org>
To:        Rafal Jaworowski <raj@semihalf.com>
Cc:        freebsd-arm@FreeBSD.ORG
Subject:   Re: Performance of SheevaPlug on 8-stable
Message-ID:  <1328042962.1662.398.camel@revolution.hippie.lan>
In-Reply-To: <C7DD05DA-AE55-4E8C-8EA2-391E6E05067A@semihalf.com>
References:  <1327980703.1662.240.camel@revolution.hippie.lan> <F48E21E0-129A-418A-B147-7D5FB01160A8@bsdimp.com> <1328025245.1662.289.camel@revolution.hippie.lan> <5FB4965A-66C9-4C99-8B61-5AC605F9ECC5@bsdimp.com> <1328030999.1662.324.camel@revolution.hippie.lan> <C7DD05DA-AE55-4E8C-8EA2-391E6E05067A@semihalf.com>

On Tue, 2012-01-31 at 21:20 +0100, Rafal Jaworowski wrote:
> On 2012-01-31, at 18:29, Ian Lepore wrote:
> 
> > On Tue, 2012-01-31 at 09:37 -0700, Warner Losh wrote:
> >> On Jan 31, 2012, at 8:54 AM, Ian Lepore wrote:
> >> 
> >>> On Mon, 2012-01-30 at 22:39 -0700, Warner Losh wrote:
> >>>> Hi Ian,
> >>>> 
> >>>> Do you have any data on what 9.0 does?
> >>>> 
> >>>> Warner
> >>> 
> >>> No.  Do you have reason to believe it will be different than 8.x?
> >>> 
> >>> It would be a major effort right now to get anything later than 8.2
> >>> built and running on one of our arm platforms.  Maybe not as hard as the
> >>> 6.2 -> 8.2 conversion was, but we're still carrying a lot of diffs from
> >>> stock FreeBSD that have to be analyzed and merged by hand.  Actually
> >>> before that can even happen I'd have to grab a snapshot of 9.0 and do an
> >>> svn->Hg conversion to even be able to start merging the diffs (and I'm
> >>> hardly an Hg expert, but those in the company who are let me know last
> >>> week that they're just as busy as me, and I'm on my own for this kind of
> >>> work).  It's work I want to do, but I suspect it's going to happen later
> >>> rather than sooner because product deadlines are beginning to loom and
> >>> my ability to spend most of my time working on the OS side of things is
> >>> waning.
> >>> 
> >>> If there are some specific changes you've got in mind that affect this
> >>> problem I might be able to backport and test them faster than I could
> >>> get a full 9.0 or -current build environment working, just point me at
> >>> them.
> >> 
> >> I thought that we'd done a root cause of this and had put a fix into the vm system.  Lemme look...
> >> 
> >> ------------------------------------------------------------------------
> >> r224049 | marcel | 2011-07-14 20:11:26 -0600 (Thu, 14 Jul 2011) | 2 lines
> >> 
> >> In pmap_protect(), don't call vm_page_dirty() if the page is unmanaged.
> >> 
> >> 
> >> ------------------------------------------------------------------------
> >> r221844 | cognet | 2011-05-13 09:54:12 -0600 (Fri, 13 May 2011) | 4 lines
> >> 
> >> In pmap_change_wiring(), use the right argument for pmap_modify_pv().
> >> It only worked because the only consumer calls pmap_change_wiring() to remove
> >> the wiring.
> >> 
> >> ------------------------------------------------------------------------
> >> r212507 | cognet | 2010-09-12 14:46:32 -0600 (Sun, 12 Sep 2010) | 5 lines
> >> 
> >> In pmap_remove_all(), do not decrease pm_stats.wired_count if the mapping was
> >> wired, as it's been done later in pmap_nuke_pv().
> >> 
> >> Submitted by:   Mark Tinguely
> >> 
> >> 
> >> ------------------------------------------------------------------------
> >> r209223 | cognet | 2010-06-15 16:16:02 -0600 (Tue, 15 Jun 2010) | 4 lines
> >> 
> >> Turn off cache if there's more than one kernel mapping, and one is writable.
> >> 
> >> Submitted by:   Mark Tinguely
> >> 
> >> ------------------------------------------------------------------------
> >> r205028 | raj | 2010-03-11 14:16:54 -0700 (Thu, 11 Mar 2010) | 12 lines
> >> 
> >> Fix ARM cache handling yet more.
> >> 
> >> 1) vm_machdep.c: remove the dangling allocations so they do not
> >>   un-necessarily turn off the cache upon consecutive access.
> >> 
> >> 2) busdma_machdep.c: remove the same amount than shadow mapped.
> >> 
> >> Reported by:    Maks Verver
> >> Submitted by:   Mark Tinguely
> >> Reviewed by:    Grzegorz Bernacki
> >> MFC after:      3 days
> >> 
> >> ------------------------------------------------------------------------
> >> r203637 | raj | 2010-02-07 13:48:57 -0700 (Sun, 07 Feb 2010) | 19 lines
> >> 
> >> Improve checking whether an ARM VA has a valid mapping before performing cache
> >> sync.
> >> 
> >> VIPT/PIPT caches need valid VA-PA mapping in PTE for a cache operation to
> >> succeed (unlike VIVT). Prior to this fix pmap was using l2pte_valid() for that
> >> check, but this is not sufficient as the function merely checks if a PTE
> >> exists (there can be existing but _invalid_ entries in the table).
> >> 
> >> A new pmap_has_valid_mapping() routine is introduced to do this job right by
> >> checking proper PTE flags.
> >> 
> >> Among other potential problems this cures coherency issues with L2 caches on
> >> MV-78100.
> >> 
> >> Submitted by:   Grzegorz Bernacki, Piotr Ziecik
> >> Reviewed, tested by:    marcel
> >> Obtained from:  Semihalf
> >> MFC after:      1 week
> >> 
> >> 
> >> Only the last two have MFC, so you can start there and see which of these changes are in...
> >> 
> >> Just thought you might have a reference board that would be easy to test...
> >> 
> >> Warner
> > 
> > I think we may have all those changes incorporated except perhaps
> > r224049; I'll make sure of that.  
> > 
> > r209223 is the change that exposed this situation.  
> > 
> > I'm skeptical that any of the changes you cite (or any change at all in
> > the pmap layer) will fix the problem, because the problem seems to be
> > rooted in the fact that the vfs buffer cache establishes a kva mapping
> > of the buffer pages with the protections set to READ|WRITE|EXEC and
> > leaves that mapping in place as long as the buffer is in the cache, and
> > r209223 says that as long as there are multiple mappings of a page with
> > at least one writable, that page's i-cache and d-cache bits stay off.
> > (The multiple mappings being the one for the buffer cache that includes
> > write access and one or more READ|EXEC mappings made by pmap() when the
> > executable or library is loaded/relocated.)
> > 
> > If my analysis is correct (and I'm fairly sure, if not 100% positive,
> > that it is), then it seems to me that the only fix available is going to
> > be at the vfs layer, and it's going to involve dropping the write access
> > to the pages in the buffer cache once any physical IO and/or uio
> > operations needing write access are completed.  
> > 
> > Even if I could figure out a patchset to fix the problem, it's going to
> > need a lot of input from the vm gurus to answer questions such as what
> > the performance impact will be to non-VIVT platforms that don't need
> > this extra work done.  If the extra work is expensive enough (and I'm
> > not sure I could evaluate that properly) it may need to be conditional
> > on whether the platform needs it.  I'm also vaguely uneasy with all this
> > > on a purely philosophical level, since this could end up basically
> > infecting MI code with a platform-specific concept.
> 
> Not sure if you've seen this, but we were discussing a long standing
> problem with FFS clustering (actually with multiple mappings) here:
> 
> http://lists.freebsd.org/pipermail/freebsd-arm/2008-December/001423.html
> 
> The changes from Mark were supposed to somewhat mitigate this (by
> introducing non-cached entries for consecutive mappings of the already
> mapped address ranges), but the problem was still observed and we're
> still using forced nocluster for stability.
> 
> Rafal
> 

I wasn't aware of that thread, thanks.  Reading through it now, it looks
like it may be the discussion that eventually led to r209223 and a few
follow-up refinements, which disable caching for pages that have
multiple mappings.
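
To make the rule concrete, here is a toy model of it as described in this
thread: a page loses caching when it has more than one mapping and at
least one of them is writable.  This is only a sketch, not the actual
pmap code; the SK_PROT_* bits and the function names are made up for
illustration, and the real logic works on pv entries rather than a flat
array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical protection bits for this sketch (not the kernel's VM_PROT_*). */
enum { SK_PROT_READ = 1, SK_PROT_WRITE = 2, SK_PROT_EXEC = 4 };

/*
 * Given the protections of every mapping of one physical page, return
 * true if the page may stay cached under the rule described above.
 */
static bool
page_may_be_cached(const int *prots, size_t nmappings)
{
	bool any_writable = false;
	size_t i;

	for (i = 0; i < nmappings; i++) {
		if (prots[i] & SK_PROT_WRITE)
			any_writable = true;
	}
	/* Caching is lost only with more than one mapping AND a writable one. */
	return !(nmappings > 1 && any_writable);
}

/* The case from this thread: a buffer cache mapping plus a text mapping. */
static void
example_usage(void)
{
	const int bufcache_and_text[] = {
		SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXEC,	/* buffer cache kva */
		SK_PROT_READ | SK_PROT_EXEC			/* executable/library */
	};

	/* Two mappings, one writable: the page goes uncached. */
	assert(!page_may_be_cached(bufcache_and_text, 2));
}
```

In this model, dropping the write bit from the buffer cache mapping once
IO is done would let the page become cacheable again, which is the kind
of vfs-level change I was describing earlier.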

I've never used nocluster, except briefly when first investigating the
loss of icache on pages, to see if it made a difference.  It made some
difference, mainly by changing the readahead behavior from large cluster
readaheads to the old readahead logic, so a smaller part of a shared lib
became permanently cache-disabled.

We also had a series of battles at Symmetricom with the 32-byte
corruption during IO.  Most of that trouble disappeared when we moved to
the 8.2 code, and the last bit of it went away with the patch I posted
in http://www.freebsd.org/cgi/query-pr.cgi?pr=160431.  That patch still
isn't committed, but I now have a newer version of it that disables
interrupts for a shorter time and still gets the job done.  I also
remember having to fix a few busdma sync calls in various drivers, but I
think they were all our own private drivers except for the at91 mci and
uart drivers.

-- Ian
