From: Ravi Pokala <rpokala@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Fri, 19 Jul 2019 12:37:57 -0700
Subject: Re: please help translate smartctl output to human language
Message-ID: <3082DC9C-9D05-499F-A4FE-712338A32D14@freebsd.org>

Hi Wojciech,

> i am interested how much write-wear does my samsung SSD experienced relative to maximum allowed.
>
> on my 500GB samsung SSD smartctl says
>
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
>   9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       31126
>  12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       59
> 177 Wear_Leveling_Count     0x0013   095   095   000    Pre-fail  Always       -       88
> 179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
> 181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
> 182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
> 183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
> 187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
> 190 Airflow_Temperature_Cel 0x0032   073   051   000    Old_age   Always       -       27
> 195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
> 199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
> 235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       28
> 241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       115175140988
>
> All seems fine but i'm not sure if i correctly understand VALUE, WORST, THRESH data for Total_LBAs_Written

For (S)ATA SMART in general, the way it works is that "VALUE" is a normalized representation, with higher values being better than lower values. Depending on the vendor, the starting value might be 253 (aka 0xff, minus a few reserved values), 200, or 100 (aka a percentage). Or, in the case of temperatures, the value of "VALUE" is usually (100 - current temperature); in the example above, that's (100 - 27) => 73.

"WORST" is the lowest value of "VALUE" that the device has recorded.
Some attributes are related to performance or short-term metrics, so the value of "VALUE" might increase and decrease over time; in that case, "WORST" is somewhat useful. Other attributes are related to usage and wear, so the value of "VALUE" will only ever decrease; in those cases, "WORST" is not very useful because it will always be the same as "VALUE".

"THRESH" is the failure threshold for the attribute; *if* the attribute is marked "Pre-fail", and *if* the value of "VALUE" is at or below the value of "THRESH", *then* the overall SMART status will be reported as failed.

In the data above, everything looks quite good; even the lowest values for "WORST" are above 90. (Except the temperature, which as described above is a little different; in this case, WORST is 051, so the highest temperature the device has seen is (100 - 51) => 49C, which isn't great, but isn't terrible.)

> 50TB was written, so it's 100 times capacity. taking some write amplification in account (i use geli so no in drive compression have effect) it would be probably like 150-200.

Nowadays, SSDs are usually rated in terms of "Device Writes per Day" (DWPD); for a device rated at 3DWPD with a 3-year warranty, the vendor is saying that it can handle writes equivalent to (3 * 3 * 365) = 3285 complete overwrites of the device. In the case of this 500GB device, that would be roughly 1.6PB of writes. Assuming this device uses 512B logical sectors, 115175140988 LBAs written would be ~59TB (about 54TiB), which is ~3.6% of that rated total.

> Value is 99. It was 100 when i bought it.
>
> Does it mean that in is 1% worn and can take 100 times more writes until it fails? or i am too optimistic?

There are a few reasons why the calculated wear (~3.6%) and the reported wear (100% - 99%) might differ.
For starters, it's not clear whether the raw value of Total_LBAs_Written counts LBAs written by the host or LBAs written to the NAND; it's possible for a request to write a single block to trigger remapping and garbage collection, resulting in write amplification. Conversely, some drives might detect when a block is being zeroed out, and might simply put a flag on the LBA and mark the underlying NAND as obsolete and ready for erasure, resulting in write suppression.

In any case, the bottom line I see here is that this device doesn't seem anywhere near wearout.

-Ravi (rpokala@)
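
For anyone who wants to redo the arithmetic in this reply for a different drive, here is a minimal Python sketch. It assumes decimal units (1 TB = 10**12 bytes), 512-byte logical sectors, and the same hypothetical 3 DWPD / 3-year rating used as the example above; the real rating has to come from the drive's datasheet. The function names are illustrative and are not part of smartctl.

```python
# Sketch of the endurance arithmetic discussed in this thread.
# Assumptions (not taken from any datasheet): decimal units,
# 512-byte logical sectors, a hypothetical 3 DWPD / 3-year rating.

SECTOR_BYTES = 512   # assumed logical sector size
TB = 10**12          # decimal terabyte


def rated_write_bytes(capacity_bytes, dwpd, warranty_years):
    """Bytes of writes implied by a DWPD rating over the warranty period."""
    return capacity_bytes * dwpd * warranty_years * 365


def bytes_written(total_lbas_written, sector_bytes=SECTOR_BYTES):
    """Convert the raw Total_LBAs_Written counter to bytes."""
    return total_lbas_written * sector_bytes


def temperature_from_normalized(value):
    """Undo the common (100 - temperature) normalization, in Celsius."""
    return 100 - value


capacity = 500 * 10**9                     # 500GB drive from the thread
rated = rated_write_bytes(capacity, dwpd=3, warranty_years=3)
written = bytes_written(115175140988)      # raw value of attribute 241
used_percent = 100 * written / rated

print(f"rated writes:   {rated / TB:.1f} TB")
print(f"written so far: {written / TB:.2f} TB")
print(f"share of rating used: {used_percent:.1f}%")
print(f"current temp: {temperature_from_normalized(73)}C, "
      f"hottest seen: {temperature_from_normalized(51)}C")
```

With these assumptions the sketch reports about 1642.5 TB rated and about 58.97 TB written, which is roughly 3.6% of the rating. Whether the counter reflects host writes or NAND writes is still unknown, so treat the percentage as an estimate.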