From owner-freebsd-fs@FreeBSD.ORG Thu Oct 18 01:28:10 2012 Return-Path: Delivered-To: FS@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7F2ABB50 for ; Thu, 18 Oct 2012 01:28:10 +0000 (UTC) (envelope-from freebsd@pki2.com) Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2]) by mx1.freebsd.org (Postfix) with ESMTP id 15FC38FC14 for ; Thu, 18 Oct 2012 01:28:10 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q9I1S2B2019779; Wed, 17 Oct 2012 18:28:02 -0700 (PDT) (envelope-from freebsd@pki2.com) Subject: Re: ZFS raidz2, errors in file? From: Dennis Glatting To: Heikki Suonsivu In-Reply-To: <507EED58.80409@suonsivu.net> References: <507EED58.80409@suonsivu.net> Content-Type: text/plain; charset="ISO-8859-1" Date: Wed, 17 Oct 2012 18:28:02 -0700 Message-ID: <1350523682.3968.6.camel@btw.pki2.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-yoursite-MailScanner-Information: Dennis Glatting X-yoursite-MailScanner-ID: q9I1S2B2019779 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-SpamScore: 2 X-MailScanner-From: freebsd@pki2.com Cc: FS@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 01:28:10 -0000 On Wed, 2012-10-17 at 20:39 +0300, Heikki Suonsivu wrote: > Summary: This could indicate that in some situations ZFS or FreeBSD > ignore disk errors in some situations, or disks do not pass them to > FreeBSD. > > There are three SCSI timeouts all on different disks reported, though it > seems unlikely they hit all three redundancies of a single block of > single file, so the more likely scenario would be that there are other > errors causing this, which are not reported or are ignored. Also, > timeouts should trigger retries. > > SMART data indicates problems on two other disks, but no indication of > those are seen in logs (the disks work, but SMART information indicates > problems). The fs is 8*3T pool on raidz2. > I can't say about your disks but I had nine 4TB Hitachi disks suffering a 30% failure rate in their first year and consistently, and randomly, being marked off-line by ZFS in a RAIDz2 pool. I eventually stripped those disks from that role, replacing them with 3TB Seagates (the Hitachies appear to work find in a ZFS mirror). I haven't experienced the same failure since I replaced the Hitachies with Seagate. > Further data below: > > ------------------------------------------------------------------------ > zpool status -v > pool: raid6 > state: ONLINE > status: One or more devices has experienced an error resulting in data > corruption. Applications may be affected. > action: Restore the file in question if possible. Otherwise restore the > entire pool from backup. > see: http://www.sun.com/msg/ZFS-8000-8A > scan: none requested > config: > > NAME STATE READ WRITE CKSUM > raid6 ONLINE 0 0 0 > raidz2-0 ONLINE 0 0 0 > da0 ONLINE 0 0 0 > da1 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > > errors: Permanent errors have been detected in the following files: > > /raid6/backup/rsnapshot/daily.1/foobar.net/users/mail/foo > ------------------------------------------------------------------------ > > All disks online, so this is kind of odd. > > One disk indeed has pending sector, not unusual and should be survivable: > > ------------------------------------------------------------------------ > ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE > UPDATED WHEN_FAILED RAW_VALUE > 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail > Always - 0 > 3 Spin_Up_Time 0x0027 162 154 021 Pre-fail > Always - 8891 > 4 Start_Stop_Count 0x0032 100 100 000 Old_age > Always - 46 > 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail > Always - 0 > 7 Seek_Error_Rate 0x002e 100 253 000 Old_age > Always - 0 > 9 Power_On_Hours 0x0032 094 094 000 Old_age > Always - 4776 > 10 Spin_Retry_Count 0x0032 100 253 000 Old_age > Always - 0 > 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age > Always - 0 > 12 Power_Cycle_Count 0x0032 100 100 000 Old_age > Always - 43 > 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always > - 40 > 193 Load_Cycle_Count 0x0032 196 196 000 Old_age Always > - 13595 > 194 Temperature_Celsius 0x0022 119 106 000 Old_age Always > - 33 > 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always > - 0 > 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always > - 1 > 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age > Offline - 1 > 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always > - 0 > 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age > Offline - 1 > > SMART Error Log Version: 1 > No Errors Logged > > SMART Self-test log structure revision number 1 > No self-tests have been logged. [To run self-tests, use: smartctl -t] > ------------------------------------------------------------------------ > > But there is no corresponding read error in /var/log/messages, which > would indicate that FreeBSD never saw the error, or ignored it. > > In addition, there seems to be ICRC DMA errors on da0. Looks nasty, but > only show up in SMART log, not in /var/log/messages. > > ------------------------------------------------------------------------ > smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.0-RELEASE-p3 amd64] (local build) > Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net > > === START OF INFORMATION SECTION === > Model Family: Hitachi Deskstar 7K3000 > Device Model: Hitachi HDS723030ALA640 > Serial Number: MK0331YHGVDK6A > LU WWN Device Id: 5 000cca 225cc00c7 > Firmware Version: MKAOA5C0 > User Capacity: 3,000,592,982,016 bytes [3.00 TB] > Sector Size: 512 bytes logical/physical > Device is: In smartctl database [for details use: -P show] > ATA Version is: 8 > ATA Standard is: ATA-8-ACS revision 4 > Local Time is: Wed Oct 17 20:03:05 2012 EEST > SMART support is: Available - device has SMART capability. > SMART support is: Enabled > > === START OF READ SMART DATA SECTION === > SMART overall-health self-assessment test result: PASSED > > General SMART Values: > Offline data collection status: (0x84) Offline data collection activity > was suspended by an > interrupting command from host. > Auto Offline Data Collection: > Enabled. > Self-test execution status: ( 0) The previous self-test routine > completed > without error or no self-test > has ever > been run. > Total time to complete Offline > data collection: (29509) seconds. > Offline data collection > capabilities: (0x5b) SMART execute Offline immediate. > Auto Offline data collection > on/off support. > Suspend Offline collection upon new > command. > Offline surface scan supported. > Self-test supported. > No Conveyance Self-test supported. > Selective Self-test supported. > SMART capabilities: (0x0003) Saves SMART data before entering > power-saving mode. > Supports SMART auto save timer. > Error logging capability: (0x01) Error logging supported. > General Purpose Logging supported. > Short self-test routine > recommended polling time: ( 1) minutes. > Extended self-test routine > recommended polling time: ( 492) minutes. > SCT capabilities: (0x003d) SCT Status supported. > SCT Error Recovery Control > supported. > SCT Feature Control supported. > SCT Data Table supported. > > SMART Attributes Data Structure revision number: 16 > Vendor Specific SMART Attributes with Thresholds: > ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE > UPDATED WHEN_FAILED RAW_VALUE > 1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail > Always - 0 > 2 Throughput_Performance 0x0005 136 136 054 Pre-fail > Offline - 83 > 3 Spin_Up_Time 0x0007 125 125 024 Pre-fail > Always - 617 (Average 617) > 4 Start_Stop_Count 0x0012 100 100 000 Old_age > Always - 62 > 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail > Always - 0 > 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail > Always - 0 > 8 Seek_Time_Performance 0x0005 135 135 020 Pre-fail > Offline - 26 > 9 Power_On_Hours 0x0012 100 100 000 Old_age > Always - 4570 > 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail > Always - 0 > 12 Power_Cycle_Count 0x0032 100 100 000 Old_age > Always - 30 > 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always > - 129 > 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always > - 129 > 194 Temperature_Celsius 0x0002 166 166 000 Old_age Always > - 36 (Min/Max 25/50) > 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always > - 0 > 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always > - 0 > 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age > Offline - 0 > 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always > - 112 > > SMART Error Log Version: 1 > ATA Error Count: 112 (device log contains only the most recent five errors) > CR = Command Register [HEX] > FR = Features Register [HEX] > SC = Sector Count Register [HEX] > SN = Sector Number Register [HEX] > CL = Cylinder Low Register [HEX] > CH = Cylinder High Register [HEX] > DH = Device/Head Register [HEX] > DC = Device Command Register [HEX] > ER = Error register [HEX] > ST = Status register [HEX] > Powered_Up_Time is measured from power on, and printed as > DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, > SS=sec, and sss=millisec. It "wraps" after 49.710 days. > > Error 112 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours) > When the command that caused the error occurred, the device was > active or idle. > > After command completion occurred, registers were: > ER ST SC SN CL CH DH > -- -- -- -- -- -- -- > 84 51 20 28 c5 ae 06 Error: ICRC, ABRT 32 sectors at LBA = > 0x06aec528 = 112117032 > > Commands leading to the command that caused the error were: > CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name > -- -- -- -- -- -- -- -- ---------------- -------------------- > 35 03 f0 58 c4 ae 40 00 2d+04:00:39.416 WRITE DMA EXT > 35 03 20 38 c4 ae 40 00 2d+04:00:39.409 WRITE DMA EXT > 35 03 f0 48 c3 ae 40 00 2d+04:00:39.402 WRITE DMA EXT > 35 03 f0 58 c2 ae 40 00 2d+04:00:39.394 WRITE DMA EXT > 35 03 60 f8 c1 ae 40 00 2d+04:00:39.388 WRITE DMA EXT > > Error 111 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours) > When the command that caused the error occurred, the device was > active or idle. > > After command completion occurred, registers were: > ER ST SC SN CL CH DH > -- -- -- -- -- -- -- > 84 51 30 08 c4 90 06 Error: ICRC, ABRT 48 sectors at LBA = > 0x0690c408 = 110150664 > > Commands leading to the command that caused the error were: > CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name > -- -- -- -- -- -- -- -- ---------------- -------------------- > 35 03 f0 48 c3 90 40 00 2d+03:59:29.925 WRITE DMA EXT > 35 03 f0 58 c2 90 40 00 2d+03:59:29.917 WRITE DMA EXT > 35 03 60 f8 c1 90 40 00 2d+03:59:29.910 WRITE DMA EXT > 35 03 b0 48 c1 90 40 00 2d+03:59:29.907 WRITE DMA EXT > 35 03 f0 58 c0 90 40 00 2d+03:59:29.899 WRITE DMA EXT > > Error 110 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours) > When the command that caused the error occurred, the device was > active or idle. > > After command completion occurred, registers were: > ER ST SC SN CL CH DH > -- -- -- -- -- -- -- > 84 51 20 28 47 90 05 Error: ICRC, ABRT 32 sectors at LBA = > 0x05904728 = 93341480 > > Commands leading to the command that caused the error were: > CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name > -- -- -- -- -- -- -- -- ---------------- -------------------- > 35 da f0 58 46 90 40 00 2d+03:49:28.919 WRITE DMA EXT > 35 da 20 38 46 90 40 00 2d+03:49:28.917 WRITE DMA EXT > 35 da f0 48 45 90 40 00 2d+03:49:28.909 WRITE DMA EXT > 35 da f0 58 44 90 40 00 2d+03:49:28.901 WRITE DMA EXT > 35 da 20 38 44 90 40 00 2d+03:49:28.899 WRITE DMA EXT > > Error 109 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours) > When the command that caused the error occurred, the device was > active or idle. > > After command completion occurred, registers were: > ER ST SC SN CL CH DH > -- -- -- -- -- -- -- > 84 51 10 38 e7 41 05 Error: ICRC, ABRT 16 sectors at LBA = > 0x0541e738 = 88205112 > > Commands leading to the command that caused the error were: > CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name > -- -- -- -- -- -- -- -- ---------------- -------------------- > 35 03 f0 58 e6 41 40 00 2d+03:46:20.939 WRITE DMA EXT > 35 03 20 38 e6 41 40 00 2d+03:46:20.933 WRITE DMA EXT > 35 03 f0 48 e5 41 40 00 2d+03:46:20.926 WRITE DMA EXT > 35 03 f0 58 e4 41 40 00 2d+03:46:20.921 WRITE DMA EXT > 35 03 20 38 e4 41 40 00 2d+03:46:20.916 WRITE DMA EXT > > Error 108 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours) > When the command that caused the error occurred, the device was > active or idle. > > After command completion occurred, registers were: > ER ST SC SN CL CH DH > -- -- -- -- -- -- -- > 84 51 10 28 0a 2f 05 Error: ICRC, ABRT 16 sectors at LBA = > 0x052f0a28 = 86968872 > > Commands leading to the command that caused the error were: > CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name > -- -- -- -- -- -- -- -- ---------------- -------------------- > 35 03 f0 48 09 2f 40 00 2d+03:45:06.250 WRITE DMA EXT > 35 03 f0 58 08 2f 40 00 2d+03:45:06.242 WRITE DMA EXT > 35 03 20 38 08 2f 40 00 2d+03:45:06.238 WRITE DMA EXT > 35 03 f0 48 07 2f 40 00 2d+03:45:06.231 WRITE DMA EXT > 35 03 08 40 bc 25 40 00 2d+03:45:06.225 WRITE DMA EXT > > SMART Self-test log structure revision number 1 > No self-tests have been logged. [To run self-tests, use: smartctl -t] > > > SMART Selective self-test log data structure revision number 1 > SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS > 1 0 0 Not_testing > 2 0 0 Not_testing > 3 0 0 Not_testing > 4 0 0 Not_testing > 5 0 0 Not_testing > Selective self-test flags (0x0): > After scanning selected spans, do NOT read-scan remainder of disk. > If Selective self-test is pending on power-up, resume after 0 minute delay. > ------------------------------------------------------------------------ > > These only show up in disk's SMART log, not as errors in other logs > (logs exists for all the history of the system). These do not seem to > have passed to the FreeBSD, or FreeBSD has ignored the errors. > > From logs, I can find da3, da4, da6 all have reported scsi timeout once > (different times, within one month in July. > > They all look like this, one case for each of the disks: > > ------------------------------------------------------------------------ > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 699 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 419 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 359 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 472 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 124 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 762 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 243 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 890 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 204 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 2 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on > device handle 0x000d SMID 291 > Jul 24 12:16:55 tila kernel: mps0: (0:6:0) terminated ioc 804b scsi 0 > state c xfer 0 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 699 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 419 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 419 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 359 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 359 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 472 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 472 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 124 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 124 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 762 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 762 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 243 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 243 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 890 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 890 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 204 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 204 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 2 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 2 complete > Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending > deferred task management request for handle 0x0d SMID 291 > Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request > on handle 0x0d SMID 291 complete > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): WRITE(16). CDB: 8a 0 0 0 > 0 1 4a 17 dc 5e 0 0 0 57 0 0 > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): CAM status: SCSI Status Error > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI status: Check Condition > Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI sense: UNIT > ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred) > Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): WRITE(10). CDB: 2a 0 28 0 > 85 fb 0 0 1 0 > Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): CAM status: SCSI Status Error > Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): SCSI status: Check Condition > Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): SCSI sense: UNIT > ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred) > ------------------------------------------------------------------------ > > If the da0 ICRC errors would have been seen by ZFS, it should have made > a) note of that in some log? b) retried write? c) Something else? If > we assume that the disk firmware is broken and does not report these to > OS, so da0 might be corrupt. But that should still be ok with raidz2. > > We do have 3 random SCSI timeouts, which were seen by FreeBSD, and thus > should have prompted ZFS do handle the errors, and one read error on > data, which is not reported as read error in any log, other than disk's > SMART info says so. > > Is it possible to get more information, which block of the file is > corrupt, on which disks the offending blocks reside, and similar > information? > > > > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"