From owner-freebsd-hardware@freebsd.org Thu Apr 13 21:16:24 2017
Date: Thu, 13 Apr 2017 20:59:32 +0000
From: heasley
To: freebsd-hardware@freebsd.org
Cc: heasley
Subject: SSD errors
Message-ID: <20170413205932.GJ2149@shrubbery.net>

I have 4 SSDs in zfs raidz2 on 11.0-RELEASE-p2. They're on QD sleds rated for sata6 to convert them from 2.5" to 3.5" slots in a Supermicro SC733TQ chassis (2012) with a Supermicro X10SRi-F mb and using the on-board controller (either one). I swapped the SATA cables for some with "extra shielding". And the 4 HDs used previously (mix of WD and seagate 750g and 2T) worked flawlessly.
The bios has a "SATA Device Type" option for SSD/hd, which is set to SSD and seems to only apply to spin-up signals. The chassis manual makes no reference to SSDs, and I have found no fbsd configuration or recommendations specific to SSDs.

When I push a lot of data to them, such as an rsync, I receive errors like the below. If I move drives between slots, the errors seem to follow the chassis slots (those closest to the power supply), but I'm not positive about this.

I suppose the questions for list are:
- have I missed any fbsd ssd-specific configuration?
- all 4 have non-zero UDMA_CRC_Error_Count counters; not many, about the
  same number, which I believe implies electrical interference - most
  likely in the cable or chassis backplane. Should I buy some specific
  model cable? other recommendations?

tia

(ada2:ahcich6:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 d0 c2 cf 40 06 00 00 00 00 00
(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada2:ahcich6:0:0:0): Retrying command
(ada2:ahcich6:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 d8 c3 cf 40 06 00 00 00 00 00
(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada2:ahcich6:0:0:0): Retrying command
(ada3:ahcich7:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 18 1d fb 40 03 00 00 00 00 00
(ada3:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada3:ahcich7:0:0:0): Retrying command
(ada3:ahcich7:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 90 31 40 40 50 00 00 00 00 00

Device Model:     Samsung SSD 850 EVO 2TB
LU WWN Device Id: 5 002538 c4042fdb8
Firmware Version: EMT02B6Q
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Apr 13 20:43:52 2017 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 265) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       2552
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       14
177 Wear_Leveling_Count     0x0013   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   074   062   000    Old_age   Always       -       26
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   099   099   000    Old_age   Always       -       33
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       2
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       5911167739

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
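
To check whether the errors really follow particular slots, one option is to tally the CRC reports per device and channel and to read the raw UDMA_CRC_Error_Count from each drive's attribute table. The following is only a rough sketch of that idea: the function names and the sample constants are made up for illustration (the sample lines are copied from the output above), and it assumes the console messages and the smartctl attribute rows keep the layout shown in this message.

```python
import re
from collections import Counter

# Matches FreeBSD CAM console lines such as
# "(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error".
# Assumption: the message layout is the same as in the log quoted above.
CAM_CRC_RE = re.compile(
    r"\((?P<dev>\w+):(?P<chan>\w+):\d+:\d+:\d+\): "
    r"CAM status: Uncorrectable parity/CRC error"
)


def crc_errors_by_channel(log_text):
    """Count CRC error reports per (device, channel) pair (hypothetical helper)."""
    counts = Counter()
    for match in CAM_CRC_RE.finditer(log_text):
        counts[(match.group("dev"), match.group("chan"))] += 1
    return counts


def smart_raw_value(smartctl_text, attribute):
    """Return RAW_VALUE for a named SMART attribute, or None if it is absent."""
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows carry 10 columns: ID#, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] == attribute:
            return int(fields[9])
    return None


# Sample input copied from the message above.
SAMPLE_LOG = """\
(ada2:ahcich6:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 d0 c2 cf 40 06 00 00 00 00 00
(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada2:ahcich6:0:0:0): Retrying command
(ada2:ahcich6:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 d8 c3 cf 40 06 00 00 00 00 00
(ada2:ahcich6:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada2:ahcich6:0:0:0): Retrying command
(ada3:ahcich7:0:0:0): READ_FPDMA_QUEUED. ACB: 60 80 18 1d fb 40 03 00 00 00 00 00
(ada3:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada3:ahcich7:0:0:0): Retrying command
"""

SAMPLE_SMART = """\
195 Hardware_ECC_Recovered 0x001a 200 200 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 099 099 000 Old_age Always - 33
"""

for (dev, chan), n in sorted(crc_errors_by_channel(SAMPLE_LOG).items()):
    print(f"{dev} on {chan}: {n} CRC error report(s)")
print("UDMA_CRC_Error_Count raw:", smart_raw_value(SAMPLE_SMART, "UDMA_CRC_Error_Count"))
```

On a live system the two inputs would presumably come from the kernel message buffer and from `smartctl -A` run against each drive. Recording the raw counter before and after moving a drive to a different slot would show whether the count keeps growing with the slot or with the drive.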

From owner-freebsd-hardware@freebsd.org Sat Apr 15 17:34:22 2017
Date: Sat, 15 Apr 2017 19:34:17 +0200
From: Michael Fuckner
To: freebsd-hardware@freebsd.org
Subject: Re: SSD errors
In-Reply-To: <20170413205932.GJ2149@shrubbery.net>
References: <20170413205932.GJ2149@shrubbery.net>

On 4/13/2017 10:59 PM, heasley wrote:
> - all 4 have non-zero UDMA_CRC_Error_Count counters; not many, about the
> same number, which I believe implies electrical interference - most
> likely in the cable or chassis backplane. Should I buy some specific
> model cable? other recommendations?

I've seen similar issues with other consumer-grade drives; it seems their electrical interface is not built for use through a backplane. Perhaps there are better cables, perhaps a newer backplane, but in the end, I believe you should use enterprise drives when connecting them to a backplane.

Regards,
Michael!