From owner-freebsd-stable@FreeBSD.ORG Fri Feb 10 21:35:56 2012
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id ADF08106566C;
	Fri, 10 Feb 2012 21:35:56 +0000 (UTC)
	(envelope-from mike@sentex.net)
Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12])
	by mx1.freebsd.org (Postfix) with ESMTP id 3DD3F8FC14;
	Fri, 10 Feb 2012 21:35:56 +0000 (UTC)
Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a])
	by smarthost1.sentex.ca (8.14.5/8.14.4) with ESMTP id q1ALZrPk085564;
	Fri, 10 Feb 2012 16:35:54 -0500 (EST)
	(envelope-from mike@sentex.net)
Message-ID: <4F358DB6.4030203@sentex.net>
Date: Fri, 10 Feb 2012 16:35:50 -0500
From: Mike Tancsa
Organization: Sentex Communications
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7
MIME-Version: 1.0
To: Jeremy Chadwick
References: <4F32E289.4080806@sentex.net> <4F32F5B0.2060203@FreeBSD.org>
	<20120208223819.GA27488@icarus.home.lan> <4F32FB5E.7050102@FreeBSD.org>
	<4F33DB75.1080202@sentex.net> <20120209152240.GA95470@icarus.home.lan>
	<4F33F056.6070300@sentex.net> <20120209163415.GA96451@icarus.home.lan>
	<4F34124F.9090808@sentex.net>
In-Reply-To: <4F34124F.9090808@sentex.net>
X-Enigmail-Version: 1.1.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.71 on IPv6:2607:f3e0:0:1::12
Cc: Alexander Motin, freebsd-stable@freebsd.org
Subject: Re: siisch1: Error while READ LOG EXT
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Fri, 10 Feb 2012 21:35:56 -0000

On 2/9/2012 1:37 PM, Mike Tancsa wrote:
> On 2/9/2012 11:34 AM, Jeremy Chadwick wrote:
>>
>> You will probably need to "track these drives" on a regular basis. That
>> is to say, set up some cronjob or similar that logs the above output to
>> a file (appends data to it), specifically output from smartctl -A (not
>> -a and not -x) and smartctl -l sataphy on a per-disk basis. smartd can
>> track SMART attribute changes, but does not track GPLog changes. Make
>> sure to put timestamps in your logs.
>
> Thanks very much for having a look, and the suggestions. I think this
> is the way to go to see which drive may have errors incrementing.
> Alexander, is there a better way you can suggest?

Got a few more of the READ LOG EXT errors, so I took a snapshot of all
the disks after the errors to compare with the snapshots from cron this
morning. Unfortunately, some of the deltas were on the one new port
multiplier and some were on the motherboard SATA.

Feb  9 04:34:55 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:05:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:06:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:07:06 backup3 last message repeated 3 times
Feb 10 16:18:24 backup3 last message repeated 16 times
Feb 10 16:18:24 backup3 kernel:
Feb 10 16:18:39 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:19:10 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:20:27 backup3 last message repeated 4 times
Feb 10 16:20:27 backup3 kernel:
Feb 10 16:20:30 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:21:33 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 10 16:23:23 backup3 last message repeated 8 times

On ada4,

-199 UDMA_CRC_Error_Count    -O--CK   200   199   000    -    13
+199 UDMA_CRC_Error_Count    -O--CK   200   199   000    -    32

 SATA Phy Event Counters (GP Log 0x11)
 ID      Size     Value  Description
-0x0001  2           13  Command failed due to ICRC error
-0x0002  2           13  R_ERR response for data FIS
-0x0003  2           13  R_ERR response for device-to-host data FIS
+0x0001  2           32  Command failed due to ICRC error
+0x0002  2           32  R_ERR response for data FIS
+0x0003  2           32  R_ERR response for device-to-host data FIS
 0x0004  2            0  R_ERR response for host-to-device data FIS
-0x0005  2            0  R_ERR response for non-data FIS
-0x0006  2            0  R_ERR response for device-to-host non-data FIS
+0x0005  2            1  R_ERR response for non-data FIS
+0x0006  2            1  R_ERR response for device-to-host non-data FIS
 0x0007  2            0  R_ERR response for host-to-device non-data FIS
 0x000a  2            0  Device-to-host register FISes sent due to a COMRESET
 0x000b  2            0  CRC errors within host-to-device FIS
-0x8000  4       744462  Vendor specific
+0x8000  4       785195  Vendor specific

 General Purpose Log 0x10 [NCQ Command Error log], Page 0-0 (of 1)
-0000000: 05 00 41 84 04 9a 53 40 00 00 00 00 00 00 00 00 |..A...S@........|
+0000000: 06 00 41 84 f2 39 6d 40 2d 00 00 00 00 00 00 00 |..A..9m@-.......|
-00001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa |................|
+00001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 25 |...............%|

ada5

-199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    11
+199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    22

-0x0001  2           11  Command failed due to ICRC error
-0x0002  2           11  R_ERR response for data FIS
-0x0003  2           11  R_ERR response for device-to-host data FIS
+0x0001  2           22  Command failed due to ICRC error
+0x0002  2           22  R_ERR response for data FIS
+0x0003  2           22  R_ERR response for device-to-host data FIS

ada6

-199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    8
+199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    25

 SATA Phy Event Counters (GP Log 0x11)
 ID      Size     Value  Description
-0x0001  2            8  Command failed due to ICRC error
-0x0002  2            8  R_ERR response for data FIS
-0x0003  2            8  R_ERR response for device-to-host data FIS
+0x0001  2           25  Command failed due to ICRC error
+0x0002  2           25  R_ERR response for data FIS
+0x0003  2           25  R_ERR response for device-to-host data FIS
 0x0004  2            0  R_ERR response for host-to-device data FIS
 0x0005  2            0  R_ERR response for non-data FIS
 0x0006  2            0  R_ERR response for device-to-host non-data FIS
 0x0007  2            0  R_ERR response for host-to-device non-data FIS
 0x000a  2            0  Device-to-host register FISes sent due to a COMRESET
 0x000b  2            0  CRC errors within host-to-device FIS
-0x8000  4       744462  Vendor specific
+0x8000  4       785195  Vendor specific

ada7

-199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    13
+199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    30

 SATA Phy Event Counters (GP Log 0x11)
 ID      Size     Value  Description
-0x0001  2           13  Command failed due to ICRC error
-0x0002  2           13  R_ERR response for data FIS
-0x0003  2           13  R_ERR response for device-to-host data FIS
+0x0001  2           30  Command failed due to ICRC error
+0x0002  2           31  R_ERR response for data FIS
+0x0003  2           31  R_ERR response for device-to-host data FIS
 0x0004  2            0  R_ERR response for host-to-device data FIS
 0x0005  2            1  R_ERR response for non-data FIS
 0x0006  2            1  R_ERR response for device-to-host non-data FIS
 0x0007  2            0  R_ERR response for host-to-device non-data FIS
 0x000a  2            0  Device-to-host register FISes sent due to a COMRESET
 0x000b  2            0  CRC errors within host-to-device FIS
-0x8000  4       744460  Vendor specific
+0x8000  4       785193  Vendor specific

 General Purpose Log 0x10 [NCQ Command Error log], Page 0-0 (of 1)
-0000000: 19 00 41 84 74 3d 4a 40 29 00 00 00 00 00 00 00 |..A.t=J@).......|
+0000000: 15 00 41 84 d7 03 1f 40 2d 00 00 00 00 00 00 00 |..A....@-.......|
 0000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 0000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 0000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
@@ -238,5 +244,5 @@
 00001c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00001d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00001e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
-00001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b3 |................|
+00001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b5 |................|

ada9

 ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
-  1 Raw_Read_Error_Rate     POSR--   115   099   006    -    91821743
+  1 Raw_Read_Error_Rate     POSR--   117   099   006    -    155365055
   3 Spin_Up_Time            PO----   093   092   000    -    0
   4 Start_Stop_Count        -O--CK   100   100   020    -    68
   5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    2
-  7 Seek_Error_Rate         POSR--   088   060   030    -    792342525
-  9 Power_On_Hours          -O--CK   074   074   000    -    22792
+  7 Seek_Error_Rate         POSR--   088   060   030    -    792482445
+  9 Power_On_Hours          -O--CK   074   074   000    -    22803
  10 Spin_Retry_Count        PO--C-   100   100   097    -    2
  12 Power_Cycle_Count       -O--CK   100   100   020    -    68
 184 End-to-End_Error        -O--CK   100   100   099    -    0
 187 Reported_Uncorrect      -O--CK   095   095   000    -    5
 188 Command_Timeout         -O--CK   100   100   000    -    0
 189 High_Fly_Writes         -O-RCK   001   001   000    -    961
-190 Airflow_Temperature_Cel -O---K   064   056   045    -    36 (Min/Max 33/37)
-194 Temperature_Celsius     -O---K   036   044   000    -    36 (0 25 0 0 0)
-195 Hardware_ECC_Recovered  -O-RC-   050   030   000    -    91821743
+190 Airflow_Temperature_Cel -O---K   066   056   045    -    34 (Min/Max 33/37)
+194 Temperature_Celsius     -O---K   034   044   000    -    34 (0 25 0 0 0)
+195 Hardware_ECC_Recovered  -O-RC-   050   030   000    -    155365055
 197 Current_Pending_Sector  -O--C-   100   100   000    -    0
 198 Offline_Uncorrectable   ----C-   100   100   000    -    0
 199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0

ada10

 SMART Attributes Data Structure revision number: 10
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
-  1 Raw_Read_Error_Rate     POSR--   118   099   006    -    196445860
+  1 Raw_Read_Error_Rate     POSR--   107   099   006    -    13128068
   3 Spin_Up_Time            PO----   095   095   000    -    0
   4 Start_Stop_Count        -O--CK   100   100   020    -    216
   5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
-  7 Seek_Error_Rate         POSR--   087   060   030    -    586360650
-  9 Power_On_Hours          -O--CK   077   077   000    -    20319
+  7 Seek_Error_Rate         POSR--   087   060   030    -    586495516
+  9 Power_On_Hours          -O--CK   077   077   000    -    20330
  10 Spin_Retry_Count        PO--C-   100   100   097    -    0
  12 Power_Cycle_Count       -O--CK   100   100   020    -    113
 183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
@@ -69,15 +69,15 @@
 187 Reported_Uncorrect      -O--CK   100   100   000    -    0
 188 Command_Timeout         -O--CK   100   100   000    -    0
 189 High_Fly_Writes         -O-RCK   099   099   000    -    1
-190 Airflow_Temperature_Cel -O---K   067   062   045    -    33 (Min/Max 31/34)
-194 Temperature_Celsius     -O---K   033   040   000    -    33 (0 22 0 0 0)
-195 Hardware_ECC_Recovered  -O-RC-   040   018   000    -    196445860
+190 Airflow_Temperature_Cel -O---K   068   062   045    -    32 (Min/Max 31/34)
+194 Temperature_Celsius     -O---K   032   040   000    -    32 (0 22 0 0 0)
+195 Hardware_ECC_Recovered  -O-RC-   028   018   000    -    13128068
 197 Current_Pending_Sector  -O--C-   100   100   000    -    0
 198 Offline_Uncorrectable   ----C-   100   100   000    -    0
 199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
-240 Head_Flying_Hours       ------   100   253   000    -    205935091929084
-241 Total_LBAs_Written      ------   100   253   000    -    1286405353
-242 Total_LBAs_Read         ------   100   253   000    -    708601879
+240 Head_Flying_Hours       ------   100   253   000    -    221530118180872
+241 Total_LBAs_Written      ------   100   253   000    -    3323838357
+242 Total_LBAs_Read         ------   100   253   000    -    1778396343

ada11

 ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
-  1 Raw_Read_Error_Rate     POSR--   120   097   006    -    242285977
+  1 Raw_Read_Error_Rate     POSR--   113   097   006    -    58229866
   3 Spin_Up_Time            PO----   092   091   000    -    0
   4 Start_Stop_Count        -O--CK   100   100   020    -    69
   5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
-  7 Seek_Error_Rate         POSR--   073   060   030    -    133894632808
-  9 Power_On_Hours          -O--CK   072   072   000    -    25283
+  7 Seek_Error_Rate         POSR--   073   060   030    -    133894764364
+  9 Power_On_Hours          -O--CK   072   072   000    -    25294
  10 Spin_Retry_Count        PO--C-   100   100   097    -    3
  12 Power_Cycle_Count       -O--CK   100   100   020    -    82
 184 End-to-End_Error        -O--CK   100   100   099    -    0
 187 Reported_Uncorrect      -O--CK   100   100   000    -    0
 188 Command_Timeout         -O--CK   100   089   000    -    124555952157
 189 High_Fly_Writes         -O-RCK   080   080   000    -    20
-190 Airflow_Temperature_Cel -O---K   059   050   045    -    41 (Min/Max 38/42)
-194 Temperature_Celsius     -O---K   041   050   000    -    41 (0 22 0 0 0)
-195 Hardware_ECC_Recovered  -O-RC-   051   032   000    -    242285977
+190 Airflow_Temperature_Cel -O---K   061   050   045    -    39 (Min/Max 38/42)
+194 Temperature_Celsius     -O---K   039   050   000    -    39 (0 22 0 0 0)
+195 Hardware_ECC_Recovered  -O-RC-   050   032   000    -    58229866
 197 Current_Pending_Sector  -O--C-   100   100   000    -    0
 198 Offline_Uncorrectable   ----C-   100   100   000    -    0
 199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
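
For anyone repeating the snapshot comparison above, here is a minimal Python sketch of how two saved `smartctl -A` outputs could be compared so that only attributes whose raw value changed are reported. It is an illustration rather than the method used in this thread: the function names `parse_attributes` and `raw_deltas` are hypothetical, and the regular expression assumes the attribute layout shown in this message (ID, name, six-character flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE). Other smartctl output formats would need a different pattern.

```python
import re

# One attribute row in the layout shown in this message, for example:
# "199 UDMA_CRC_Error_Count    -O--CK   200   199   000    -    13"
# Assumption: flags are always six characters and the raw value starts with digits.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([A-Z-]{6})\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute_name: raw_value} for every attribute row found in text."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            # Group 2 is the attribute name, group 8 the leading integer of RAW_VALUE.
            attrs[match.group(2)] = int(match.group(8))
    return attrs


def raw_deltas(before, after):
    """Return {name: (old, new)} for attributes whose raw value differs."""
    old = parse_attributes(before)
    new = parse_attributes(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


# The two ada4 rows quoted in this message, used as sample input.
ADA4_BEFORE = "199 UDMA_CRC_Error_Count    -O--CK   200   199   000    -    13"
ADA4_AFTER = "199 UDMA_CRC_Error_Count    -O--CK   200   199   000    -    32"

for name, (was, now) in raw_deltas(ADA4_BEFORE, ADA4_AFTER).items():
    print(f"{name}: {was} -> {now} ({now - was:+d})")
```

Fed the two ada4 rows from this message, the sketch reports the UDMA_CRC_Error_Count raw value moving from 13 to 32. It covers only the SMART attribute table; the SATA Phy Event Counters from `smartctl -l sataphy` use a different row layout and would need their own pattern.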