Date: Fri, 19 Dec 2008 15:07:12 +0100
From: Grzegorz Bernacki <gjb@semihalf.com>
To: arm@freebsd.org, embedded@freebsd.org
Subject: Multiple virtual mappings considered harmful on ARM
Message-ID: <494BAA90.7000801@semihalf.com>
Hi,

I have recently investigated a problem with data corruption when copying big files on an ARM machine. Below are my findings.

1. High level scenario.

The problem occurs when copying big files (~300MB and more). The MD5 checksums calculated for the original and the copied files differ. The corrupted chunks of data are always 32 bytes long, i.e. the cache line length.

2. Root cause.

The root cause of the problem is the additional virtual mapping of read/write buffers during cluster read/write (sys/kern/vfs_cluster.c, cluster_rbuild(), cluster_wbuild()). Buffers for sequential read/write operations are concatenated and sent to the device as one big buffer. The concatenation uses pmap_qenter(), which puts an *additional* mapping in the KVA for a physical area that is already mapped. For each buffer we extract the pages it contains, and then all the pages from all the buffers are mapped at the new virtual address of the new buffer. So we end up with at least two virtual addresses for each page.

On systems with a virtual cache (most ARMs) this scenario leads to serious problems: unflushed modified data pertaining to the same physical page can be cached in separate cache blocks. The data written back first (associated with virtual mapping #1) is potentially overwritten by the data associated with virtual mapping #2 when its cache content is written back later, or vice versa.

3. Workaround for FFS read/write problems.

Avoid clustered reading/writing on ARM:

  # mount -o noclusterr -o noclusterw /dev/da0a /mnt/

4. More general solution.

This is the second time we have identified a problem of this nature related to multiple virtual mappings on ARM, and we are wondering about a more general solution that would protect us from such problems (very subtle and hard to nail down) in the future. We were thinking at least about an extension to DIAGNOSTIC that would detect such mapping attempts.

Any other suggestions or comments are welcome.
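
The write-back ordering hazard from section 2 can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions, not kernel code: one 32-byte "physical" line is cached independently under two virtual aliases, and every name here (phys_line, struct alias_line, alias_write(), alias_writeback(), demo_alias_corruption()) is hypothetical and exists only for this illustration. Real cache behaviour depends on the CPU and on eviction timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32	/* cache line length reported in the message */

/* One cache-line-sized piece of "physical memory" (toy model). */
static uint8_t phys_line[LINE_SIZE];

/* The copy of phys_line held in the cache under one virtual alias. */
struct alias_line {
	uint8_t	data[LINE_SIZE];
	int	valid;
	int	dirty;
};

/* Write one byte through an alias, filling the line from memory on a miss. */
static void
alias_write(struct alias_line *l, size_t off, uint8_t val)
{
	assert(off < LINE_SIZE);
	if (!l->valid) {
		memcpy(l->data, phys_line, LINE_SIZE);
		l->valid = 1;
	}
	l->data[off] = val;
	l->dirty = 1;
}

/* Write a dirty line back to memory; the whole line is copied. */
static void
alias_writeback(struct alias_line *l)
{
	if (l->valid && l->dirty) {
		memcpy(phys_line, l->data, LINE_SIZE);
		l->dirty = 0;
	}
}

/*
 * Write byte 0 through mapping #1 and byte 1 through mapping #2, then
 * write both lines back. Returns 1 if the byte written through
 * mapping #1 was lost, 0 otherwise. With flush_between set, mapping #1
 * is written back before mapping #2 touches the line.
 */
static int
demo_alias_corruption(int flush_between)
{
	struct alias_line va1 = { {0}, 0, 0 }, va2 = { {0}, 0, 0 };

	memset(phys_line, 0, LINE_SIZE);
	alias_write(&va1, 0, 0xAA);
	if (flush_between)
		alias_writeback(&va1);
	alias_write(&va2, 1, 0xBB);	/* fills from memory, not from va1 */
	alias_writeback(&va1);
	alias_writeback(&va2);		/* overwrites the full 32 bytes */
	return (phys_line[0] != 0xAA);
}
```

In this model demo_alias_corruption(0) reports the loss, because the second write-back replaces the whole line with a copy that never saw the first write, while demo_alias_corruption(1) keeps both bytes. The lost unit is one full cache line, which matches the 32-byte corrupted chunks observed in section 1.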