From: Mike Tancsa
Organization: Sentex Communications
Date: Wed, 08 Feb 2012 16:00:57 -0500
To: FreeBSD-STABLE Mailing List
Message-ID: <4F32E289.4080806@sentex.net>
Subject: siisch1: Error while READ LOG EXT

I have a 4-port eSATA PCIe card with 3 external port multipliers attached on an AMD64 box (8 GB of RAM), running RELENG_8 from Feb 1st.

siis0@pci0:5:0:0: class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI-X to Serial ATA Controller (SiI 3124)'
    class      = mass storage
    subclass   = RAID
    bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
    bar   [18] = type Memory, range 64, base 0xb4400000, size 32768, enabled
    bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
    cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split transactions
    cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siis0: port 0x3000-0x300f mem 0xb4408000-0xb440807f,0xb4400000-0xb4407fff irq 19 at device 0.0 on pci5
siis0: [ITHREAD]
siisch0: at channel 0 on siis0
siisch0: [ITHREAD]
siisch1: at channel 1 on siis0
siisch1: [ITHREAD]
siisch2: at channel 2 on siis0
siisch2: [ITHREAD]
siisch3: at channel 3 on siis0
siisch3: [ITHREAD]

# camcontrol devlist
at scbus0 target 0 lun 0 (pass0,ada0)
at scbus0 target 1 lun 0 (pass1,ada1)
at scbus0 target 2 lun 0 (pass2,ada2)
at scbus0 target 3 lun 0 (pass3,ada3)
at scbus0 target 15 lun 0 (pass4,pmp1)
at scbus1 target 0 lun 0 (pass5,ada4)
at scbus1 target 1 lun 0 (pass6,ada5)
at scbus1 target 2 lun 0 (pass7,ada6)
at scbus1 target 3 lun 0 (pass8,ada7)
at scbus1 target 4 lun 0 (pass9,ada8)
at scbus1 target 15 lun 0 (pass10,pmp0)
at scbus4 target 0 lun 0 (pass11,da0)
at scbus4 target 0 lun 1 (pass12,da1)
at scbus4 target 16 lun 0 (pass13)
at scbus5 target 0 lun 0 (pass14,da2)
at scbus6 target 0 lun 0 (pass15,ada9)
at scbus7 target 0 lun 0 (pass16,ada10)
at scbus8 target 0 lun 0 (pass17,ada11)
at scbus11 target 0 lun 0 (pass18,ada12)

Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) along with the odd slot timeout error.

Feb 7 23:49:32 backup3 kernel: siisch1: ... waiting for slots 47000000
Feb 7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb 7 23:49:32 backup3 kernel: siisch1: siis_timeout is 07040000 ss 7f17e8b9 rs 7f17e8b9 es 00000000 sts 801d2000 serr 00680000
Feb 7 23:49:32 backup3 kernel: siisch1: ... waiting for slots 43000000
Feb 7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb 7 23:49:34 backup3 kernel: siisch1: siis_timeout is 07040000 ss 7f17e8b9 rs 7f17e8b9 es 00000000 sts 801d2000 serr 00680000
Feb 7 23:49:34 backup3 kernel: siisch1: ... waiting for slots 03000000
Feb 7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb 7 23:49:34 backup3 kernel: siisch1: siis_timeout is 07040000 ss 7f17e8b9 rs 7f17e8b9 es 00000000 sts 801d2000 serr 00680000
Feb 7 23:49:34 backup3 kernel: siisch1: ... waiting for slots 01000000
Feb 7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb 7 23:49:34 backup3 kernel: siisch1: siis_timeout is 07040000 ss 7f17e8b9 rs 7f17e8b9 es 00000000 sts 801d2000 serr 00680000
Feb 7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 01:33:52 backup3 last message repeated 2 times
Feb 8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 01:50:31 backup3 last message repeated 2 times
Feb 8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb 8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT

smartctl doesn't show any issues on the drives other than one
that has some historical errors from a while ago. What are these errors, and do I need to worry about them? The "READ LOG EXT" ones are new. This is the only drive with anything in its logs, so I am not sure if this is what is causing the driver to complain.

smartctl -a /dev/ada9
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-STABLE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31000333AS
Serial Number:    9TE14SRV
LU WWN Device Id: 5 000c50 010a39664
Firmware Version: SD35
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Feb 8 15:49:12 2012 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207951
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207957

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 617) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 203) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   111   099   006    Pre-fail  Always       -       41201023
  3 Spin_Up_Time            0x0003   093   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       68
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x000f   088   060   030    Pre-fail  Always       -       791743293
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       22755
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       2
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       68
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   095   095   000    Old_age   Always       -       5
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   001   001   000    Old_age   Always       -       961
190 Airflow_Temperature_Cel 0x0022   065   056   045    Old_age   Always       -       35 (Min/Max 33/37)
194 Temperature_Celsius     0x0022   035   044   000    Old_age   Always       -       35 (0 25 0 0)
195 Hardware_ECC_Recovered  0x001a   049   030   000    Old_age   Always       -       41201023
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 5
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 5 occurred at disk power-on lifetime: 18292 hours (762 days + 4 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
60 00 1a ff ff ff 4f 00  11d+02:29:18.542  READ FPDMA QUEUED
60 00 1a ff ff ff 4f 00  11d+02:29:18.542  READ FPDMA QUEUED
60 00 1b ff ff ff 4f 00  11d+02:29:18.541  READ FPDMA QUEUED
60 00 19 ff ff ff 4f 00  11d+02:29:18.541  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:18.541  READ FPDMA QUEUED

Error 4 occurred at disk power-on lifetime: 18292 hours (762 days + 4 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
60 00 1a ff ff ff 4f 00  11d+02:29:15.783  READ FPDMA QUEUED
60 00 1a ff ff ff 4f 00  11d+02:29:15.780  READ FPDMA QUEUED
60 00 1b ff ff ff 4f 00  11d+02:29:15.732  READ FPDMA QUEUED
60 00 19 ff ff ff 4f 00  11d+02:29:15.732  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:15.731  READ FPDMA QUEUED

Error 3 occurred at disk power-on lifetime: 18292 hours (762 days + 4 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
60 00 1b ff ff ff 4f 00  11d+02:29:12.889  READ FPDMA QUEUED
60 00 19 ff ff ff 4f 00  11d+02:29:12.889  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:12.888  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:12.888  READ FPDMA QUEUED
60 00 1a ff ff ff 4f 00  11d+02:29:12.888  READ FPDMA QUEUED

Error 2 occurred at disk power-on lifetime: 18292 hours (762 days + 4 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
60 00 1b ff ff ff 4f 00  11d+02:29:10.011  READ FPDMA QUEUED
60 00 19 ff ff ff 4f 00  11d+02:29:10.011  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:10.010  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:10.010  READ FPDMA QUEUED
60 00 1a ff ff ff 4f 00  11d+02:29:10.010  READ FPDMA QUEUED

Error 1 occurred at disk power-on lifetime: 18292 hours (762 days + 4 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
60 00 1b ff ff ff 4f 00  11d+02:29:07.148  READ FPDMA QUEUED
60 00 19 ff ff ff 4f 00  11d+02:29:07.140  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:07.131  READ FPDMA QUEUED
60 00 1c ff ff ff 4f 00  11d+02:29:07.117  READ FPDMA QUEUED
60 00 35 ff ff ff 4f 00  11d+02:29:07.111  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
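
Two pieces of arithmetic in the output above can be cross-checked by hand or with a few lines of Python. This is only a sketch under stated assumptions, not code taken from the siis driver or from smartctl. It assumes that the hexadecimal value after "waiting for slots" is a bitmask with one bit per command slot (the sequence of "Timeout on slot" messages is consistent with that reading), and that the LBA smartctl prints is assembled from the low nibble of DH followed by CH, CL and SN, as in 28-bit addressing. The function names slots_in_mask and lba28 are made up for this illustration.

```python
def slots_in_mask(mask):
    """Return the slot numbers whose bits are set in a 32-bit mask.

    Assumption: bit N of the logged value corresponds to command slot N.
    """
    return [bit for bit in range(32) if mask & (1 << bit)]


def lba28(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the logged register bytes.

    Assumption: the low 4 bits of DH hold LBA bits 27..24, followed by
    CH (bits 23..16), CL (bits 15..8) and SN (bits 7..0).
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


if __name__ == "__main__":
    # Masks copied from the "waiting for slots" kernel messages.
    for mask in (0x47000000, 0x43000000, 0x03000000, 0x01000000):
        print(f"{mask:08x} -> slots {slots_in_mask(mask)}")

    # Register bytes copied from the SMART error log: SN CL CH DH = ff ff ff 0f.
    lba = lba28(0xFF, 0xFF, 0xFF, 0x0F)
    print(f"LBA = 0x{lba:08x} = {lba}")
```

Read this way, the first mask covers slots 24, 25, 26 and 30, and each later mask drops exactly the slot named in the preceding "Timeout on slot" line (26, then 30, then 25, then 24). The register bytes reproduce the 0x0fffffff = 268435455 that smartctl reports for all five errors.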

---Mike

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/