From owner-freebsd-ppc@FreeBSD.ORG Mon Dec 24 00:23:02 2012 Return-Path: Delivered-To: powerpc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9FE02C3 for ; Mon, 24 Dec 2012 00:23:02 +0000 (UTC) (envelope-from jroberson@jroberson.net) Received: from mail-pa0-f45.google.com (mail-pa0-f45.google.com [209.85.220.45]) by mx1.freebsd.org (Postfix) with ESMTP id 64D278FC1A for ; Mon, 24 Dec 2012 00:23:02 +0000 (UTC) Received: by mail-pa0-f45.google.com with SMTP id bg2so3778865pad.18 for ; Sun, 23 Dec 2012 16:22:56 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:date:from:x-x-sender:to:cc:subject:in-reply-to :message-id:references:user-agent:mime-version:content-type :x-gm-message-state; bh=MBCwf05DwGqYCM45Cih6/F2x9xScFQ0qDCgMUHgMC8M=; b=pNg46GEZZOpka320uKlYWzaT2VPbl2ZKDGS3Zvf0Pk2FJiuPkH4fmxZdHIEaSqDgbp 4b7BfCzFnhA4mPSzkJhLUnVOc8qnlmF7g7/sI1WI/6FmmLkN0RZ4miiMlfYKddiXefBc 7X70FJtmONstmLkIw5WCL0GWenoT66BTVf1xA7GZH1KbDDehvaTV/Xgxmyetr5bMJlCi deEHQVjyGJAYSNK1zEViv9eEVY72bSdoCAysfoApFHGBocLqKBAc4HHpqqu/Tu23MaA8 gg37c3AoVnZUEY9rtnOladRAQ1BunnyBZlcrE2DDkro+3bGrAcTSCWxjQsQUBs6ZtlFx eanA== X-Received: by 10.66.83.136 with SMTP id q8mr57987918pay.83.1356308576649; Sun, 23 Dec 2012 16:22:56 -0800 (PST) Received: from rrcs-66-91-135-210.west.biz.rr.com (rrcs-66-91-135-210.west.biz.rr.com. [66.91.135.210]) by mx.google.com with ESMTPS id pj1sm11235205pbb.71.2012.12.23.16.22.53 (version=SSLv3 cipher=OTHER); Sun, 23 Dec 2012 16:22:55 -0800 (PST) Date: Sun, 23 Dec 2012 14:22:51 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: Ian Lepore Subject: Re: Call for testing and review, busdma changes In-Reply-To: <1355085250.87661.345.camel@revolution.hippie.lan> Message-ID: References: <1355077061.87661.320.camel@revolution.hippie.lan> <1355085250.87661.345.camel@revolution.hippie.lan> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Gm-Message-State: ALoCoQljUyMNNZTv5nAINVwihpa+3DQ8APlOn5tA+NYNh3PeakLLSArIsEyeCIWHANC+uaNILQiV X-Mailman-Approved-At: Mon, 24 Dec 2012 00:39:03 +0000 Cc: powerpc@freebsd.org, marcel@freebsd.org, mips@freebsd.org, mav@freebsd.org, scottl@freebsd.org, attilio@freebsd.org, kib@freebsd.org, sparc64@freebsd.org, arm@freebsd.org X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Dec 2012 00:23:02 -0000 On Sun, 9 Dec 2012, Ian Lepore wrote: > On Sun, 2012-12-09 at 08:48 -1000, Jeff Roberson wrote: >> >> On Sun, 9 Dec 2012, Ian Lepore wrote: >> >>> On Sat, 2012-12-08 at 08:51 -1000, Jeff Roberson wrote: >>>> Hello, >>>> >>>> http://people.freebsd.org/~jeff/physbio.diff >>>> >>>> I have a relative large patch that reforms the busdma API so that new >>>> types may be added without modifying every architecture's >>>> busdma_machdep.c. It does this by unifying the bus_dmamap_load_buffer() >>>> routines so that they may be called from MI code. The MD busdma is then >>>> given a chance to do any final processing in the complete() callback. >>>> This patch also contains cam changes to unify the bus_dmamap_load* >>>> handling in cam drivers. >>>> >>>> The arm and mips implementations saw the largest changes since they have >>>> to track virtual addresses for sync(). Previously this was done in a type >>>> specific way. Now it is done in a generic way by recording the list of >>>> virtuals in the map. >>>> >>>> I have verified that this patch passes make universe which includes >>>> several kernel builds from each architecture. I suspect that if I broke >>>> anything your machine simply won't boot or will throw I/O errors. There >>>> is little subtlety, it is mostly refactoring. >>>> > > First an FYI, my replies to this thread aren't making it to the mailing > lists, except when quoted by someone replying to them. They're being > held for moderator approval because they're going to too many addresses. > >>> >>> Testing on armv4/v5 platforms, the patches applied and compiled cleanly, >>> but fault-abort with a NULL deref on what appears to be the first call >>> to bus_dmamap_load(). I'll dig deeper into debugging that after >>> browsing the new code so I can figure out where to start debugging. >> >> Can you give me the file and line of the fault? >> > > Better than that, I can give you patches (attached). There were two > little glitches... the allocated slist wasn't getting attached to the > map, and when building the slist in _bus_dmamap_load_buffer() the slist > needs to participate in the coalescing logic near the end of the loop. I understand the first problem. The second, I'm not sure about. When we're going through bounce pages we use virtual to virtual. Why do you need to flush the cache lines for the source addresses? The device will only do io to the bounce pages so they are the only that need other synchronization. > > I'm not positive the attached quick-fix is perfect, but it allows one of > my arm units here to boot and run and pass some basic IO smoke tests. > I'll be trying it on other arm platforms later today. > >>> >>> Just from an initial quick browsing of the new code for armv4, I notice >>> these things... >>> >>> The new routine __bus_dmamap_mayblock() does (*flags) |= BUS_DMA_WAITOK; >>> which is a no-op because BUS_DMA_WAITOK is zero. I'm not sure what the >>> intent was, but I think any modification of the WAIT-related flags would >>> be conceptually wrong because it overrides the caller's instructions on >>> whether or not to do a deferred load when resources aren't available. >> >> The intent of the original load function seems to be that it is always >> blocking. Most architectures force the flag. The mbuf and uio load >> routines don't support blocking at all because the callback lacks the >> information needed to restart the operation. >> > > The manpage states that bus_dmamap_load() never blocks for any reason. > The mbuf and uio flavors say they are variations of bus_dmapmap_load(); > I take the implication of that to mean that they also are defined as > never blocking. > > The docs then say that WAITOK vs. NOWAIT control the scheduling of a > deferred load, and that NOWAIT is forced in the mbuf and uio cases. > Your refactored code already handles that by adding NOWAIT to the > caller-specified flags for mbufs and uio. > > So the code will do the right thing now (with your patches), but only > because (*flags) |= BUS_DMA_WAITOK is a no-op. This code is problematic. We can only support WAITOK operations for bus_dmamap_load() because the callbacks from the swi when bounce pages are available can only remember one pointer. To fix this for real we'd have to remember the original type and call the appropriate load function again. This could be solved by having a { type, pointer, length } structure that is passed between the front and back-end. This would solve your mbuf problem as well. I think I will do this next. By the way I have an mbuf branch that pushes data and mbuf structure into separate cachelines. It's not only faster on arm and the like but on x86 as well. So someone should pursue this and you'd have way fewer partial line flushes. > >>> >>> The way you've refactored the mbuf-related load routines, the MD code is >>> no longer able to tell that it's an mbuf operation. This comes back to >>> what I said a few days ago... when you look at the current code on >>> various platforms you see a lot of false similarities. The code isn't >>> nearly identical because it really all needs to be the same, it's >>> because it was all copied from the same source, and it doesn't currently >>> work correctly on arm and mips. With your changes, the ability to >>> correct the implementation on those platforms is now gone. Perhaps this >>> can be fixed easily by defining some new flags that the MI routines can >>> OR-in to give the MD routines more info on what's happening. (Correcting >>> the mbuf sync handling on arm and mips requires that we know at sync >>> time that the buffers involved are mbufs, which means we need to know at >>> map-load time so we can set a flag or a type code or something in the >>> map that we can examine at sync time.) >> >> Why do you need to know it is an mbuf if you have a record of the virtual >> addresses? The only type specific information was used to find the >> addresses. See how arm's busdma_machdep-v6 ws implemented. >> > > Because there is a special rule for mbufs on VIVT-cache architectures: > the address of the data buffer is not aligned to a cacheline boundary, > but the sync code is allowed to treat the buffer as if it is aligned and > thus avoid invoking the partial cacheline flush algorithm. That > algorithm was shown a few months ago to be fatally flawed, and we're on > a path (albeit in super-slow-motion) to eliminate it. You can skip the sync because you know the information in the mbuf header is not necessary? > > I have patches to fix the mbuf partial sync and other things related to > partial sync problems, and my next step will be to try to rework them to > fit in with what you've done. Based on what I've seen so far in your > refactoring, I think just passing a flag to say it's an mbuf, from the > MI to the MD mapping routine, will provide enough info. I think I described a better approach above that will solve mutliple problems. Let me know what you think. Jeff > >>> >>>> The next step is to allow for dma loading of physical addresses. This >>>> will permit unmapped I/O. Which is a significant performance optimization >>>> targeted for 10.0. >>> >>> This sounds scary for arm and mips, or more specifically for VIVT cache >>> platforms that can only do a sync based on virtual addresses. Can you >>> talk some more about the plans for this? >> >> It will be for addresses which were never mapped into kva. To support >> unmaapped io. There will only be a need for bounce pages, no syncing. >> > > Can the pages be mapped into any non-kernel vmspaces? If they aren't > mapped into any vmspace at all, then I think there'll be no implications > for VIVT cache architectures. If they appear in any pmap at all then > there may be problems. > > -- Ian > >