From: Mike Tancsa
Organization: Sentex Communications
Date: Mon, 17 Dec 2012 17:07:33 -0500
To: FreeBSD-STABLE Mailing List
Subject: WRITE_FPDMA_QUEUED CAM status: ATA Status Error
Message-ID: <50CF97A5.6090806@sentex.net>

Hi,
Is there a way to tell, or at least narrow down, whether errors like the
ones below are due to a bad cable or a bad port multiplier? The disks in
one particular cage are throwing these errors (RELENG9 from today).

siis0@pci0:5:0:0: class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image, Inc.'
    device     = 'SiI 3124 PCI-X Serial ATA Controller'
    class      = mass storage
    subclass   = RAID
    bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
    bar   [18] = type Memory, range 64, base 0xb4400000, size 32768, enabled
    bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
    cap 01[64] = powerspec 2 supports D0 D1 D2 D3 current D0
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split transactions
    cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siisch2: Error while READ LOG EXT
(ada3:siisch2:0:3:0): WRITE_FPDMA_QUEUED. ACB: 61 56 af 71 0a 40 40 00 00 00 00 00
(ada3:siisch2:0:3:0): CAM status: ATA Status Error
(ada3:siisch2:0:3:0): ATA status: 00 ()
(ada3:siisch2:0:3:0): RES: 00 00 00 00 00 00 00 00 00 00 00
(ada3:siisch2:0:3:0): Retrying command
siisch2: Error while READ LOG EXT
(ada2:siisch2:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 07 dc d8 0b 40 40 00 00 00 00 00
(ada2:siisch2:0:2:0): CAM status: ATA Status Error
(ada2:siisch2:0:2:0): ATA status: 00 ()
(ada2:siisch2:0:2:0): RES: 00 00 00 00 00 00 00 00 00 00 00
(ada2:siisch2:0:2:0): Retrying command
(ada2:siisch2:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 01 0c 1e 06 40 40 00 00 00 00 00
(ada2:siisch2:0:2:0): CAM status: ATA Status Error
(ada2:siisch2:0:2:0): ATA status: 00 ()
(ada2:siisch2:0:2:0): RES: 00 00 00 00 00 00 00 00 00 00 00
(ada2:siisch2:0:2:0): Retrying command
(ada2:siisch2:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 06 2d 88 00 40 40 00 00 00 00 00
(ada2:siisch2:0:2:0): CAM status: ATA Status Error
(ada2:siisch2:0:2:0): ATA status: 00 ()
(ada2:siisch2:0:2:0): RES: 00 00 00 00 00 00 00 00 00 00 00
(ada2:siisch2:0:2:0): Retrying command

# smartctl -x /dev/ada2
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-PRERELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Black
Device Model:     WDC WD2002FAEX-007BA0
Serial Number:    WD-WMAY02759120
LU WWN Device Id: 5 0014ee 656c4b593
Firmware Version: 05.01D05
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Dec 17 17:04:28 2012 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (29280) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 298) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3037) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    1
  3 Spin_Up_Time            POS--K   253   253   021    -    8833
  4 Start_Stop_Count        -O--CK   100   100   000    -    15
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   093   092   000    -    5288
 10 Spin_Retry_Count        -O--CK   100   253   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    10
192 Power-Off_Retract_Count -O--CK   200   200   000    -    8
193 Load_Cycle_Count        -O--CK   200   200   000    -    6
194 Temperature_Celsius     -O---K   104   096   000    -    48
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    22515
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
GP/S  Log at address 0x00 has    1 sectors [Log Directory]
SMART Log at address 0x01 has    1 sectors [Summary SMART error log]
SMART Log at address 0x02 has    5 sectors [Comprehensive SMART error log]
GP    Log at address 0x03 has    6 sectors [Ext. Comprehensive SMART error log]
SMART Log at address 0x06 has    1 sectors [SMART self-test log]
GP    Log at address 0x07 has    1 sectors [Extended self-test log]
SMART Log at address 0x09 has    1 sectors [Selective self-test log]
GP    Log at address 0x10 has    1 sectors [NCQ Command Error log]
GP    Log at address 0x11 has    1 sectors [SATA Phy Event Counters]
GP/S  Log at address 0x80 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x81 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x82 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x83 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x84 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x85 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x86 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x87 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x88 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x89 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8a has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8b has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8c has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8d has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8e has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8f has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x90 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x91 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x92 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x93 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x94 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x95 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x96 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x97 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x98 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x99 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9a has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9b has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9c has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9d has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9e has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9f has   16 sectors [Host vendor specific log]
GP/S  Log at address 0xa0 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa1 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa2 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa3 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa4 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa5 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa6 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa7 has   16 sectors [Device vendor specific log]
GP/S  Log at address 0xa8 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xa9 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xaa has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xab has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xac has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xad has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xae has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xaf has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb0 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb1 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb2 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb3 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb4 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb5 has    1 sectors [Device vendor specific log]
GP    Log at address 0xb6 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xb7 has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xbd has    1 sectors [Device vendor specific log]
GP/S  Log at address 0xc0 has    1 sectors [Device vendor specific log]
GP    Log at address 0xc1 has   24 sectors [Device vendor specific log]
GP/S  Log at address 0xe0 has    1 sectors [SCT Command/Status]
GP/S  Log at address 0xe1 has    1 sectors [SCT Data Transfer]

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                 48 Celsius
Power Cycle Min/Max Temperature:     42/48 Celsius
Lifetime    Min/Max Temperature:     42/56 Celsius
Under/Over Temperature Limit Count:  0/0
SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:     0/60 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (67)

Index    Estimated Time   Temperature Celsius
  68    2012-12-17 09:07    48  *****************************
 ...    ..(170 skipped).    ..  *****************************
 239    2012-12-17 11:58    48  *****************************
 240    2012-12-17 11:59     ?  -
 241    2012-12-17 12:00    42  ***********************
 242    2012-12-17 12:01    42  ***********************
 243    2012-12-17 12:02    43  ************************
 244    2012-12-17 12:03    44  *************************
 ...    ..(  6 skipped).    ..  *************************
 251    2012-12-17 12:10    44  *************************
 252    2012-12-17 12:11    45  **************************
 ...    ..(  9 skipped).    ..  **************************
 262    2012-12-17 12:21    45  **************************
 263    2012-12-17 12:22    46  ***************************
 ...    ..( 15 skipped).    ..  ***************************
 279    2012-12-17 12:38    46  ***************************
 280    2012-12-17 12:39    47  ****************************
 ...    ..( 30 skipped).    ..  ****************************
 311    2012-12-17 13:10    47  ****************************
 312    2012-12-17 13:11    48  *****************************
 ...    ..( 54 skipped).    ..  *****************************
 367    2012-12-17 14:06    48  *****************************
 368    2012-12-17 14:07    47  ****************************
 ...    ..( 27 skipped).    ..  ****************************
 396    2012-12-17 14:35    47  ****************************
 397    2012-12-17 14:36    48  *****************************
 ...    ..(147 skipped).    ..  *****************************
  67    2012-12-17 17:04    48  *****************************

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            1  Command failed due to ICRC error
0x0002  2            1  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            1  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x000a  2            0  Device-to-host register FISes sent due to a COMRESET
0x000b  2            1  CRC errors within host-to-device FIS
0x8000  4         7720  Vendor specific

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/