From owner-freebsd-fs@FreeBSD.ORG Sat Nov 16 02:56:21 2013
Delivered-To: freebsd-fs@freebsd.org
From: "John Refling"
Subject: rare, random issue with read(), mmap() failing to read entire file
Date: Fri, 15 Nov 2013 18:56:09 -0800
Message-ID: <9CB46A22C0BE40029652144B2586462A@d40>
List-Id: Filesystems

I'm having some very insidious issues with copying and verifying (identical) data from several hard disks. This might be a hardware issue or something very deep in the disk / filesystem code. I have verified this with several disks and motherboards. It corrupts 0.0096% of my files, different files each time!

Background:

1. I have a 500 GB USB hard disk (the new 4,096 [4k] sector size) which I have been using to store a master archive of over 70,000 files.

2. To make a backup of the USB disk, I copied everything over to a 500 GB SATA hard disk. [Various combinations of `cp -r', `scp -r', `tar -cf - . | rsh ... tar -xf -', etc.]

3. To verify that the copy was correct, I did sha256 sums of all files on both disks.

4. When comparing the sha256 sums on both drives, I discovered that 6 or so files did not compare OK from one drive to the other.

5. When I checked the files individually, the files compared OK, and even when I recomputed their individual sha256 sums, I got DIFFERENT sha256 sums which were correct this time!

The above led me to investigate further. Using ONLY the USB disk, I recomputed the sha256 sums for all files ON THAT DISK. A small number (6-12) of files ON THE SAME DISK had different sha256 sums than previously computed! The disk is read-only so nothing could have changed.

To try to get to the bottom of this, I took the sha256 code and put it in my own file reading routine, which reads in data from the file using read(). On summing up the total bytes read in the read() loop, I discovered that on the files that failed to compare, the read() returned EOF before the actual EOF. According to the manual page this is impossible. I compared the total number of bytes read by the read() loop to the stat() file length value, and they were different! Obviously, the sha256 sum will be different since not all of the file is read.

This happens consistently on 6 to 12 files out of 70,000+ *every* time, and on DIFFERENT files *every* time. So things work 99.9904% of the time. But something fails 0.0096% (one hundredth of one percent) of the time, which with a large number of files is significant!

Instead of read(), I tried mmap()ing chunks of the file. Using mmap() to access the data in the file instead of read() resulted in a (different) sha256 sum than the read() version! The mmap() version was correct, except in ONE case where BOTH versions were WRONG, when compared to a 3rd and 4th run!

Using `diff -rq disk1 disk2` resulted in similar issues. There were always a few files that failed to compare. Doing another `diff -rq disk1 disk2` resulted in a few *other* files that failed to compare, while the ones that didn't compare OK the first time DID compare OK the second time. This happened to 6-12 files out of 70,000+. Whatever is affecting my use of read() in my sha256 routine seems to also affect system utilities such as diff!
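
For anyone who wants to repeat the read() length check described above, here is a minimal sketch, assuming a POSIX system. It is not the routine used for the tests in this message: the names `count_bytes_read` and `check_file` and the 64 KiB buffer size are hypothetical choices made for illustration. It reads each file to EOF with read(), retries on EINTR, and compares the byte total against the size reported by fstat().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a file to EOF with read() and report both the
 * number of bytes actually returned and the size reported by fstat().
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
static int
count_bytes_read(const char *path, off_t *total, off_t *expected)
{
	static char buf[64 * 1024];	/* arbitrary buffer size */
	struct stat st;
	ssize_t n;
	int fd, saved;

	assert(path != NULL && total != NULL && expected != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	if (fstat(fd, &st) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return (-1);
	}
	*expected = st.st_size;
	*total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n > 0) {
			*total += n;	/* a short read is legal; keep reading */
		} else if (n == 0) {
			break;		/* read() reported end of file */
		} else if (errno != EINTR) {
			saved = errno;	/* real error: give up on this file */
			close(fd);
			errno = saved;
			return (-1);
		}
		/* n < 0 with EINTR: interrupted before any data, retry */
	}
	close(fd);
	return (0);
}

/*
 * Hypothetical wrapper: print a line for any file whose read() total
 * differs from its fstat() size. Returns 0 if the totals match, 1 if not.
 */
static int
check_file(const char *path)
{
	off_t total, expected;

	if (count_bytes_read(path, &total, &expected) < 0) {
		perror(path);
		return (1);
	}
	if (total != expected) {
		printf("%s: read() returned %lld of %lld bytes\n", path,
		    (long long)total, (long long)expected);
		return (1);
	}
	return (0);
}
```

Calling `check_file()` on every path in the archive (for example from a small main() that walks argv) would list only the files on which read() hit EOF early, without the cost of computing sha256 sums.
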

This gets really insidious because I don't know if the original `cp -r disk1 disk2` did these short reads on a few files while copying the files, thus corrupting my archive backup (on 6-12 files)!

Some of the files that fail are small (10KB) and some are huge (8GB).

HELP! It takes 7 hours to recompute the sha256 sums of the files on the disk so random experiments are time consuming, but I'm willing to try things that are suggested.

System details:

This is observed with the following disks:

    Western Digital 500GB SATA 512 byte sectors
    Hitachi 500GB SATA 512 byte sectors
    Iomega RPHD-UG3 500GB USB 4096 byte sectors

in combination with these motherboards:

    P4M800Pro-M V2.0: Pentium D 2.66 GHz, 2GB memory
    HP/Compaq Evo: Pentium 4, 2.8 GHz, 2GB memory

OS version: FreeBSD 9.1 RELEASE #0

No hardware errors were noted in /var/log/messages during the file reading.

Ran Spinrite on the disks to freshen (re-read/write) all sectors, with no errors.

The file systems were built using:

    dd if=/dev/zero of=/dev/xxx bs=2m
    newfs -m0 /dev/xxx

Looked through the mailing lists and bug reports but can't see anything similar.

Thanks for your help,

John Refling