From: Warner Losh
Date: Sun, 26 Aug 2012 17:03:46 -0600
To: Ian Lepore
Cc: Hans Petter Selasky, freebsd-arm@freebsd.org, freebsd-mips@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: Partial cacheline flush problems on ARM and MIPS

On Aug 26, 2012, at 11:42 AM, Ian Lepore wrote:
> On Thu, 2012-08-23 at 22:00 -0600, Warner Losh wrote:
>> The bottom line is that you can't mix things like that when cache
>> lines are involved.  The current code that tries is doomed to failure.
>> Doomed.  You just can't control all flushes, as Ian's missive
>> demonstrates, and trying to accommodate code that does this I don't
>> think can possibly work.  All the interrupt masking, copying in and
>> out, etc I fear is doomed to utter and abject failure.
>>
> Until last weekend I was in the camp that thought the partial cacheline
> flush problem was solvable with sufficiently clever code.  Now I agree
> that we're doomed to failure and it's time to try another direction.
>
> We're going to have some implementation work to do in arm and mips
> busdma, but I think the larger part of the task is going to be defining
> more rigorously how a driver must interact with the busdma system to
> function correctly on all types of platforms, and then update existing
> drivers to conform.
>
> The busdma manpage currently has some vague words about the usage and
> sequencing of sync ops, such as "If read and write operations are not
> preceded and followed by the appropriate synchronization operations,
> behavior is undefined."  I think we should more explicitly spell out
> what the appropriate sequences are.  In particular:
>
>   * The PRE and POST operations must occur in pairs; a PREREAD must
>     be followed eventually by a POSTREAD and a PREWRITE must be
>     followed by a POSTWRITE.

PREREAD means "I am about to tell the device to put data here, have
whatever things might be pending in the CPU complex get out of the
way."  Usually this means "invalidate the cache for that range", but not
always.  POSTREAD means "The device's DMA is done, I'd like to start
accessing it now."  If the memory will be thrown away without being
looked at, then does the driver necessarily need to issue the POSTREAD?
I think so, but I don't know if that's a new requirement.

>   * The CPU is not allowed to access the mapped memory after a PRE
>     sync and before the corresponding POST sync.

Correct.

>   * The DMA hardware is not allowed to access the mapped memory
>     after a POST sync and before the next PRE sync.

Correct.

>   * Read and write sync operators may be combined in a single call,
>     PRE and POST operators may not be.  E.G., PREREAD|PREWRITE is
>     allowed, PREREAD|POSTREAD is not.  We should note that while
>     read and write operations may be combined, on some platforms
>     PREREAD|PREWRITE is needlessly expensive when only a read is
>     being performed.

Correct.

> We also need some rules about working with buffers obtained from
> bus_dmamem_alloc() and external buffers passed to bus_dmamap_load().  I
> think the rule should be that a buffer obtained from bus_dmamem_alloc(),
> or more formally any region of memory mapped by a bus_dmamap_load(), is
> a single logical object which can only be accessed by one entity at a
> time.  That means that there cannot be two concurrent DMA operations
> happening in different regions of the same buffer, nor can DMA and CPU
> access be happening concurrently even if in different parts of the
> buffer.

There's something subtle that I'm missing.  Why would two DMA operations
be disallowed?  The rest makes good sense.

> I've always thought that allocating a dma buffer feels like a big
> hassle.  You sometimes have to create a tag for the sole purpose of
> setting the maxsize to get the buffer size you need when you call
> bus_dmamem_alloc().  If bus_dmamem_alloc() took a size parm you could
> just use your parent tag, or a generic tag appropriate to all the IO
> you're doing for a given device.  If you need a variety of buffers for
> small control and command and status transfers of different sizes, you
> end up having to manage up to a dozen tags and maps and buffers.  It's
> all very clunky and inconvenient.  It's just the sort of thing that
> makes you want to allocate a big buffer and subdivide it.  Surely we
> could do something to make it easier?

You'd wind up creating a quick tag on the fly for the bus_dmamem_alloc()
call if you wanted to do this.  Cleanup then becomes unclear.

Warner
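
For reference, the sync-op rules discussed in this message can be
sketched as a small user-space ownership model.  This is only an
illustration and not FreeBSD code: the SYNC_* flags, struct dma_model,
and the dma_model_*() functions are hypothetical names, and the flag
values are arbitrary.  Two simplifying assumptions go beyond what the
thread states: the buffer starts out owned by the CPU, and a combined
PREREAD|PREWRITE must be closed by a single combined
POSTREAD|POSTWRITE call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags modeled on the PRE/POST sync ops; values are illustrative. */
enum {
    SYNC_PREREAD   = 1 << 0,
    SYNC_POSTREAD  = 1 << 1,
    SYNC_PREWRITE  = 1 << 2,
    SYNC_POSTWRITE = 1 << 3
};

#define SYNC_PRE_MASK  (SYNC_PREREAD | SYNC_PREWRITE)
#define SYNC_POST_MASK (SYNC_POSTREAD | SYNC_POSTWRITE)

enum dma_owner { OWNER_CPU, OWNER_DEVICE };

/* One mapped buffer, treated as a single logical object. */
struct dma_model {
    enum dma_owner owner;   /* who may touch the buffer right now */
    int pending;            /* PRE ops still waiting for their POST */
};

static void dma_model_init(struct dma_model *m)
{
    assert(m != NULL);
    m->owner = OWNER_CPU;   /* assumption: CPU owns the buffer initially */
    m->pending = 0;
}

/* Returns true and updates ownership if the sync op is legal, else false. */
static bool dma_model_sync(struct dma_model *m, int op)
{
    assert(m != NULL);
    bool pre = (op & SYNC_PRE_MASK) != 0;
    bool post = (op & SYNC_POST_MASK) != 0;

    if (op == 0 || (op & ~(SYNC_PRE_MASK | SYNC_POST_MASK)) != 0)
        return false;       /* unknown or empty op */
    if (pre && post)
        return false;       /* PRE and POST may not be combined */

    if (pre) {
        if (m->owner != OWNER_CPU)
            return false;   /* earlier PRE has not seen its POST yet */
        m->pending = op;
        m->owner = OWNER_DEVICE;
        return true;
    }

    /* POST: must pair with the pending PRE ops (simplified: all at once). */
    int expected = 0;
    if (m->pending & SYNC_PREREAD)
        expected |= SYNC_POSTREAD;
    if (m->pending & SYNC_PREWRITE)
        expected |= SYNC_POSTWRITE;
    if (m->owner != OWNER_DEVICE || op != expected)
        return false;
    m->pending = 0;
    m->owner = OWNER_CPU;
    return true;
}

/* CPU access is legal only outside a PRE..POST window. */
static bool dma_model_cpu_may_access(const struct dma_model *m)
{
    return m->owner == OWNER_CPU;
}

/* Device access is legal only inside a PRE..POST window. */
static bool dma_model_device_may_access(const struct dma_model *m)
{
    return m->owner == OWNER_DEVICE;
}
```

A driver-side check built on this model would call dma_model_sync()
wherever the real code issues a sync op and test
dma_model_cpu_may_access() before touching the buffer.  Whether the
real busdma code should enforce anything like this is exactly the open
question in the thread.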