Date: Thu, 23 Feb 2006 13:31:16 +0100 (CET)
From: Michael Reifenberger <mike@Reifenberger.com>
To: pjd@FreeBSD.org
Cc: FreeBSD Stable <freebsd-stable@FreeBSD.org>
Subject: graid3 data corruption?!?
Message-ID: <20060223131549.V38816@fw.reifenberger.com>

Hi,

I have 5 FireWire disks in one graid3 set and am running a fresh STABLE
with SMP on a dual AMD64 in i386 mode.

While doing an md5 checksum of all files in the filesystem (~770GB of data),
one disk died.
graid3 did the right thing and disconnected the disk.

BUT: after diffing the md5sums of the files, one large file (probably the
one that was being checked during the disk failure) had a different md5sum
than before.

--- md5_11.log Fri Dec 9 13:23:07 2005
+++ md5_12.log Wed Feb 22 18:03:03 2006
@@ -4460,3 +4460,3 @@
 MD5 (Backup/totum/root_0_050211_i386.dmp.gz) = 5a3e7b03f48ea4c2cba10624edd996cf
-MD5 (Backup/totum/root_0_050715.dmp.gz) = 0e154301cbec84571d1df94bf68e3d79
+MD5 (Backup/totum/root_0_050715.dmp.gz) = 172d7c12b78f3f191c184d467e31a53c
 MD5 (RIP/.pgp/PGPMacBinaryMappings.txt) = bf1b637a3a69bcbb8d4177be46a1c3ac

BUT: doing a fresh md5sum of the file now, in degraded mode, I again get
the (correct) value of:
MD5 (Backup/totum/root_0_050715.dmp.gz) = 0e154301cbec84571d1df94bf68e3d79

To me this means that graid3 returned incorrect data during the disk loss.
This shouldn't happen!

Any clues as to how this could happen?
Has anyone else seen this behaviour?

BTW: dmesg showed:
...
GEOM_RAID3: Device data created (id=0).
GEOM_RAID3: Device data: provider da5s1a detected.
GEOM_RAID3: Device data: provider da4s1a detected.
GEOM_RAID3: Device data: provider da3s1a detected.
GEOM_RAID3: Device data: provider da2s1a detected.
GEOM_RAID3: Device data: provider da1s1a detected.
GEOM_RAID3: Device data: provider da1s1a activated.
GEOM_RAID3: Device data: provider da2s1a activated.
GEOM_RAID3: Device data: provider da4s1a activated.
GEOM_RAID3: Device data: provider da3s1a activated.
GEOM_RAID3: Device data: provider da5s1a activated.
GEOM_RAID3: Device data: provider raid3/data launched.
...
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 6f 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): ABORTED COMMAND asc:0,0
(da2:sbp0:0:0:0): No additional sense information
(da2:sbp0:0:0:0): Retrying Command (per Sense Data)
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 6f 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): MEDIUM ERROR asc:4b,0
(da2:sbp0:0:0:0): Data phase error
(da2:sbp0:0:0:0): Retrying Command (per Sense Data)
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 6f 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): ABORTED COMMAND asc:0,0
(da2:sbp0:0:0:0): No additional sense information
(da2:sbp0:0:0:0): Retrying Command (per Sense Data)
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 6f 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): MEDIUM ERROR asc:4b,0
(da2:sbp0:0:0:0): Data phase error
(da2:sbp0:0:0:0): Retrying Command (per Sense Data)
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 6f 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): ABORTED COMMAND asc:0,0
(da2:sbp0:0:0:0): No additional sense information
(da2:sbp0:0:0:0): Retries Exhausted
GEOM_RAID3: Request failed. da2s1a[READ(offset=79432531968, length=32768)]
GEOM_RAID3: Device data: provider da2s1a disconnected.
GEOM_RAID3: Request failed. da2s1a[READ(offset=79432761344, length=32768)]
GEOM_RAID3: Device data: provider [unknown] disconnected.
GEOM_RAID3: Request failed. da2s1a[READ(offset=79432695808, length=32768)]
GEOM_RAID3: Device data: provider [unknown] disconnected.
GEOM_RAID3: Request failed. da2s1a[READ(offset=79432663040, length=32768)]
GEOM_RAID3: Device data: provider [unknown] disconnected.
GEOM_RAID3: Request failed. da2s1a[READ(offset=79432630272, length=32768)]
GEOM_RAID3: Device data: provider [unknown] disconnected.
GEOM_RAID3: Request failed. da2s1a[READ(offset=79432597504, length=32768)]
GEOM_RAID3: Device data: provider [unknown] disconnected.
...
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 80 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): MEDIUM ERROR asc:4b,0
(da2:sbp0:0:0:0): Data phase error
(da2:sbp0:0:0:0): Retrying Command (per Sense Data)
(da2:sbp0:0:0:0): READ(10). CDB: 28 0 9 3f 46 80 0 0 40 0
(da2:sbp0:0:0:0): CAM Status: SCSI Status Error
(da2:sbp0:0:0:0): SCSI Status: Check Condition
(da2:sbp0:0:0:0): MEDIUM ERROR asc:4b,0
(da2:sbp0:0:0:0): Data phase error
(da2:sbp0:0:0:0): Retrying Command (per Sense Data)

The last CAM errors occurred during `dd`.

Bye/2
---
Michael Reifenberger, Business Development Manager SAP-Basis, Plaut Consulting
Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com
http://www.plaut.de | http://www.Reifenberger.com
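
The comparison of the two md5 logs described in the message can also be done with a short script instead of diff(1). The following is a minimal sketch, assuming both logs use the BSD md5(1) output format shown in the message; the function names `parse_md5_log` and `changed_files` are hypothetical, and the sample lines are copied from the quoted diff rather than from the full logs.

```python
import re

# Matches BSD md5(1) output lines such as: MD5 (path/to/file) = <32 hex digits>
MD5_LINE = re.compile(r"^MD5 \((.+)\) = ([0-9a-f]{32})$")


def parse_md5_log(lines):
    """Return {path: digest} for every line that is in md5(1) output format."""
    digests = {}
    for line in lines:
        match = MD5_LINE.match(line.strip())
        if match:
            digests[match.group(1)] = match.group(2)
    return digests


def changed_files(old, new):
    """Return {path: (old_digest, new_digest)} for paths present in both logs
    whose digests differ. Files present in only one log are ignored."""
    common = old.keys() & new.keys()
    return {p: (old[p], new[p]) for p in common if old[p] != new[p]}


# Sample data: the three lines visible in the diff quoted in the message.
old_log = [
    "MD5 (Backup/totum/root_0_050211_i386.dmp.gz) = 5a3e7b03f48ea4c2cba10624edd996cf",
    "MD5 (Backup/totum/root_0_050715.dmp.gz) = 0e154301cbec84571d1df94bf68e3d79",
    "MD5 (RIP/.pgp/PGPMacBinaryMappings.txt) = bf1b637a3a69bcbb8d4177be46a1c3ac",
]
new_log = [
    "MD5 (Backup/totum/root_0_050211_i386.dmp.gz) = 5a3e7b03f48ea4c2cba10624edd996cf",
    "MD5 (Backup/totum/root_0_050715.dmp.gz) = 172d7c12b78f3f191c184d467e31a53c",
    "MD5 (RIP/.pgp/PGPMacBinaryMappings.txt) = bf1b637a3a69bcbb8d4177be46a1c3ac",
]

mismatches = changed_files(parse_md5_log(old_log), parse_md5_log(new_log))
for path, (before, after) in sorted(mismatches.items()):
    print(f"{path}: {before} -> {after}")
```

Keying on the path rather than on line position means the result does not depend on the two logs listing files in the same order. To check the real logs, the sample lists could be replaced with the lines read from md5_11.log and md5_12.log.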