From owner-freebsd-arm@FreeBSD.ORG  Thu Jun 28 21:56:56 2012
Return-Path: <owner-freebsd-arm@FreeBSD.ORG>
Delivered-To: freebsd-arm@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7E1D3106564A
	for <freebsd-arm@freebsd.org>; Thu, 28 Jun 2012 21:56:56 +0000 (UTC)
	(envelope-from freebsd@damnhippie.dyndns.org)
Received: from qmta07.emeryville.ca.mail.comcast.net
	(qmta07.emeryville.ca.mail.comcast.net [76.96.30.64])
	by mx1.freebsd.org (Postfix) with ESMTP id 5F7AC8FC14
	for <freebsd-arm@freebsd.org>; Thu, 28 Jun 2012 21:56:56 +0000 (UTC)
Received: from omta22.emeryville.ca.mail.comcast.net ([76.96.30.89])
	by qmta07.emeryville.ca.mail.comcast.net with comcast
	id TxqD1j0041vN32cA7xwqv3; Thu, 28 Jun 2012 21:56:50 +0000
Received: from damnhippie.dyndns.org ([24.8.232.202])
	by omta22.emeryville.ca.mail.comcast.net with comcast
	id Txwp1j00E4NgCEG8ixwpru; Thu, 28 Jun 2012 21:56:50 +0000
Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240])
	by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id q5SLulue048812; 
	Thu, 28 Jun 2012 15:56:47 -0600 (MDT)
	(envelope-from freebsd@damnhippie.dyndns.org)
From: Ian Lepore <freebsd@damnhippie.dyndns.org>
To: Alexander Motin <mav@freebsd.org>
In-Reply-To: <4FE2EDBA.1030505@FreeBSD.org>
References: <4FE2EDBA.1030505@FreeBSD.org>
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 28 Jun 2012 15:56:47 -0600
Message-ID: <1340920607.1110.93.camel@revolution.hippie.lan>
Mime-Version: 1.0
X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 7bit
Cc: freebsd-arm@freebsd.org
Subject: Re: Cache write-back issue on Marvell SoC (SheevaPlug)
X-BeenThere: freebsd-arm@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the StrongARM Processor <freebsd-arm.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arm>,
	<mailto:freebsd-arm-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arm>
List-Post: <mailto:freebsd-arm@freebsd.org>
List-Help: <mailto:freebsd-arm-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arm>,
	<mailto:freebsd-arm-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 28 Jun 2012 21:56:56 -0000

On Thu, 2012-06-21 at 12:47 +0300, Alexander Motin wrote:
> Hi.
> 
> Trying to localize regular data corruption during writes (reads seem
> not affected) to a SATA disk on a SheevaPlug box, I've found that it
> is probably the result of a cache coherency issue. Reading the data
> back shows that each time exactly 32 sequential aligned data bytes are
> corrupted. That, if I understand correctly, matches a single cache
> line's size/offset.
> 
> I've found that this dirty hack, flushing the whole D-cache after
> doing the normal bus_dmamap_sync(), fixes the situation:
> 
> --- mvs.c       (revision 237359)
> +++ mvs.c       (working copy)
> @@ -1307,6 +1312,10 @@ mvs_dmasetprd(void *arg, bus_dma_segment_t *segs,
>          bus_dmamap_sync(ch->dma.data_tag, slot->dma.data_map,
>              ((slot->ccb->ccb_h.flags & CAM_DIR_IN) ?
>              BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE));
> +#if defined(__arm__)
> +       if (slot->ccb->ccb_h.flags & CAM_DIR_OUT)
> +               cpu_dcache_wbinv_all();
> +#endif
>          if (ch->basic_dma)
>                  mvs_legacy_execute_transaction(slot);
>          else
> 
> Unfortunately I know nothing about ARM assembler or the cache control
> interfaces. Could somebody recheck the existing D-cache range
> write-back code, because there seems to be a problem?
> 

Since I'm pretty familiar with debugging the ARM busdma code, I had a
look at this today.  Nothing jumps out at me as wrong.

It appears that the Marvell document describing the MMU commands for
Kirkwood chips is not publicly available (I guess you need a corporate
account or something to get it).  I checked the NetBSD implementation
(essentially identical to FreeBSD's) and the Linux one (much simpler
code; apparently we've got room for improvement).  The Linux code seems
to be structured to use two different cache flushing schemes, as if
different chip variations might have different MMU feature sets, but I
couldn't find any real information on that.

Have you noticed any pattern in the addresses of the corrupted blocks?
In particular, is it always the first or last cacheline of the buffer
(or SG segment), or always the first or last line within a page, or
anything like that?  Are there ever multiple corruptions within a
single DMA transfer?  Are the corruptions rare or frequent?  Does it
happen only on large transfers, or only on small ones?
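
If it helps with gathering that data, here is a rough userland sketch
of the kind of check I mean: compare what was written against what was
read back, one cacheline at a time, and report where the bad lines
fall.  Everything in it is an assumption for illustration: the names
(compare_by_line, struct line_report), the 32-byte line size (taken
from your report), the 4096-byte page size, and the premise that the
buffer starts on a page boundary so that buffer offsets map directly to
page offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CACHE_LINE	32u	/* assumed D-cache line size, per the report */
#define PAGE_BYTES	4096u	/* assumed page size */

/* Hypothetical summary of one written-vs-read-back comparison. */
struct line_report {
	size_t	bad_lines;	/* number of line-sized blocks that differ */
	size_t	first_bad_off;	/* offset of first bad line, (size_t)-1 if none */
	int	first_line_bad;	/* first line of the buffer is corrupted */
	int	last_line_bad;	/* last line of the buffer is corrupted */
};

/*
 * Compare the expected data with the data read back, CACHE_LINE bytes
 * at a time, printing the buffer offset and page offset of each block
 * that differs.  Assumes both buffers begin on a page boundary.
 */
static struct line_report
compare_by_line(const unsigned char *expect, const unsigned char *got,
    size_t len)
{
	struct line_report r = { 0, (size_t)-1, 0, 0 };
	size_t off, n;

	assert(expect != NULL && got != NULL);
	for (off = 0; off < len; off += CACHE_LINE) {
		/* The final block may be shorter than a full line. */
		n = (len - off < CACHE_LINE) ? len - off : CACHE_LINE;
		if (memcmp(expect + off, got + off, n) == 0)
			continue;
		if (r.bad_lines == 0)
			r.first_bad_off = off;
		r.bad_lines++;
		if (off == 0)
			r.first_line_bad = 1;
		if (off + n == len)
			r.last_line_bad = 1;
		printf("bad line at offset %zu (page offset %zu)\n",
		    off, off % PAGE_BYTES);
	}
	return (r);
}
```

Calling this once per transfer and logging the result alongside the
transfer length would show whether the bad lines cluster at buffer or
page edges, and whether a single transfer ever has more than one.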

-- Ian