From owner-freebsd-embedded@FreeBSD.ORG Fri Dec 19 14:30:09 2008
Date: Fri, 19 Dec 2008 15:07:12 +0100
From: Grzegorz Bernacki <gjb@semihalf.com>
To: arm@freebsd.org, embedded@freebsd.org
Subject: Multiple virtual mappings considered harmful on ARM
List-Id: Dedicated and Embedded Systems

Hi,

I've recently been investigating a problem with data corruption when
copying big files on an ARM machine. Below are my findings.

1. High-level scenario.

The problem occurs when copying big files (~300MB and more): the MD5
checksums calculated for the original and the copied file differ. The
corrupted chunks of data are always 32 bytes long, i.e. one cache line.

2. Root cause.

The root cause is an additional virtual mapping of read/write buffers
made during clustered reads/writes (sys/kern/vfs_cluster.c,
cluster_rbuild(), cluster_wbuild()). Buffers for sequential read/write
operations are concatenated and sent to the device as one big buffer.
The concatenation uses pmap_qenter(), which puts an *additional*
mapping in the KVA for a physical area that is already mapped: for
each buffer we extract the pages it contains, and then all the pages
from all the buffers are mapped at the new virtual address of the
combined buffer. So we end up with at least two virtual addresses for
each page. (A simplified sketch of this mapping step is appended at
the end of this mail.)

On systems with a virtual cache (most ARMs) this scenario leads to
serious problems: unflushed modified data belonging to the same
physical page can be cached in separate cache lines. Data written back
first (associated with virtual mapping #1) is potentially overwritten
by data associated with virtual mapping #2 when its cache contents are
written back later, or vice versa.

3. Workaround for the FFS read/write problems.

Avoid clustered reading/writing on ARM:

# mount -o noclusterr -o noclusterw /dev/da0a /mnt/

4. More general solution.

This is the second time we have identified a problem of this nature,
related to multiple virtual mappings on ARM, and we are wondering
about a more general solution that would protect us from such problems
(very subtle and hard to nail down) in the future. We were thinking at
least about an extension to DIAGNOSTIC that would detect such mapping
attempts (a strawman of such a check is also appended below). Any
other suggestions or comments are welcome.
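
Appendix A. To make the double mapping from section 2 concrete, here
is a minimal sketch, not the actual vfs_cluster.c code: the function
name cluster_map_sketch() and the flat tbps[] array are illustrative
inventions, while pmap_qenter() and the b_data/b_pages/b_npages fields
of struct buf are the real interfaces involved.

/*
 * Sketch of the double-mapping pattern in cluster_rbuild() /
 * cluster_wbuild().  Locking, error handling and the real buffer
 * walk are omitted; loop bounds are illustrative only.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bio.h>
#include <sys/buf.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <vm/vm_page.h>

static void
cluster_map_sketch(struct buf *bp, struct buf **tbps, int nbufs)
{
	vm_page_t pages[MAXPHYS / PAGE_SIZE];
	int i, j, npages = 0;

	/*
	 * Each component buffer tbps[i] already has its pages mapped
	 * at tbps[i]->b_data -- virtual mapping #1.  Collect them.
	 */
	for (i = 0; i < nbufs; i++)
		for (j = 0; j < tbps[i]->b_npages; j++)
			pages[npages++] = tbps[i]->b_pages[j];

	/*
	 * Map the very same physical pages again at the combined
	 * buffer's KVA -- virtual mapping #2.  On a virtual cache
	 * the two mappings can now hold unsynchronized copies of
	 * the same cache line, which is exactly the corruption
	 * described in section 2.
	 */
	pmap_qenter((vm_offset_t)bp->b_data, pages, npages);
}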
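
Appendix B. For section 4, a strawman of what a DIAGNOSTIC-time alias
check might look like. This assumes pmap_page_is_mapped() is a good
enough proxy for "this page already has another virtual mapping"; a
real version would have to whitelist the many legitimate aliased
mappings, so treat this purely as a sketch of the idea, not a patch.

/*
 * Hypothetical DIAGNOSTIC-only helper that could be called from
 * pmap_qenter() to flag pages being given a second virtual mapping.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <vm/vm_page.h>

#ifdef DIAGNOSTIC
static void
pmap_alias_check(vm_offset_t va, vm_page_t *m, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		/* Does this page already have a mapping elsewhere? */
		if (pmap_page_is_mapped(m[i]))
			printf("pmap_qenter: page %p gets second "
			    "mapping at va %#jx\n", (void *)m[i],
			    (uintmax_t)(va + (vm_offset_t)i * PAGE_SIZE));
	}
}
#endif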