Date: Wed, 18 Nov 2009 02:51:55 +1100 (EST)
From: Ian Smith <smithi@nimnet.asn.au>
To: Roland Smith <rsmith@xs4all.nl>
Cc: Bruce Cran <bruce@cran.org.uk>, freebsd-questions@freebsd.org, "Ronald F. Guilmette" <rfg@tristatelogic.com>
Subject: Re: Bad Blocks... Should I RMA?
Message-ID: <20091118014634.S65262@sola.nimnet.asn.au>
In-Reply-To: <20091116231341.40E3F10656B0@hub.freebsd.org>
References: <20091116231341.40E3F10656B0@hub.freebsd.org>
In freebsd-questions Digest, Vol 285, Issue 3, Message 28
On Mon, 16 Nov 2009 23:16:27 +0100 Roland Smith <rsmith@xs4all.nl> wrote:
> On Mon, Nov 16, 2009 at 09:43:31PM +0000, Bruce Cran wrote:
> > On Mon, 16 Nov 2009 19:23:58 +0100
> > Roland Smith <rsmith@xs4all.nl> wrote:
> >
> > > Install the smartmontools port, and check the drive with
> > > 'smartctl -a /dev/ad4'. If you see a non-zero Reallocated_Sector_Ct,
> > > RMA it immediately, as it is about to fail. If you see other errors
> > > reported, RMA it.
> > >
> > > (S)ATA disks have spare sectors available. If a sector fails, it is
> > > replaced by one of the spares by the firmware. If you see a non-zero
> > > Reallocated_Sector_Ct, it means that the drive has run out of spares.
> > > This is bad news.
> >
> > Surely it's the other way around - if you see a value of zero in the
> > "value" column the drive has run out of spare sectors and it's time to
> > RMA the drive?
>
> I was talking about the _RAW_VALUE column. There seem to be some differences
> in interpretation between vendors as to what the VALUE column means. Most of
> the advice I've seen over the years says to look at the RAW_VALUE.
>
> See http://en.wikipedia.org/wiki/S.M.A.R.T. as well.

Mmm, but as that article - which really only mentions the 'normalised' values smartctl presents in passing - points out, there can be quite a lot of variation between different manufacturers as to what RAW_VALUE actually represents for various attributes, whereas the usage of VALUE WORST THRESH values is much more consistent, and is what the vendor is actually presenting as the SMART good/fair/fail analysis to the world.

For instance, I've got two Fujitsu 5400rpm 2.5" drives in two laptops, one MHV2040AH with near 19,000 hours on it, and a much newer MHV2120AH, 40 and 120GB respectively. Nice quiet low-power laptop drives, fwiw.
Both show as (more recently) being in the smartctl database, and both show _exactly_ the same values for this one:

  5 Reallocated_Sector_Ct 0x0033 100 100 024 Pre-fail Always - 8589934592000

Now if that were a number of 512-byte sectors, it'd be 4096000 GB! :) but both drives are 100% ok, as the VALUE / WORST figures show.

> > From what I've seen the 'raw' column appears to count
> > the number of sectors the drive has remapped using the spares buffer.
> > If it gets into the hundreds it's probably time to think about RMA'ing
> > the drive
>
> Yes, the raw value is the number of sectors allocated from the spares. I
> originally thought it was the number of reallocations _beyond_ the
> spares. That's a misunderstanding on my part.

Again, may depend on the drive make/model. With the same make/model you can of course usefully compare raw values, but be careful about drawing inferences for different drives, or you may be RMA'ing needlessly ..

> Nevertheless this attribute (along with several) is marked on the Wikipedia
> page for smart as a "Potential indicator of imminent electromechanical
> failure". You can find the same attributes marked as critical when perusing
> mailing list archives.
>
> For me, my data is worth much more than the harddisk it is on. Some of it is
> literally irreplaceable. So my policy is to go look for a replacement harddisk
> as soon as the RAW_VALUEs of any of these critical indicators start going up
> from zero. And store any data at least on two harddisks, whether in a mirror
> or in a cron+rsync setup.

That'd be the case for the disks you tend to use.
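As a quick check of the arithmetic on the Fujitsu raw value quoted earlier in this message, here is a minimal Python sketch. It confirms the 4096000 figure (in binary gigabytes) and also shows that the number happens to be an exact multiple of 2**32. Splitting the value at bit 32 is purely an illustrative assumption; nothing in this thread documents how Fujitsu lays out the raw field, so treat the "high part / low part" reading as a guess, not as the vendor's format.

```python
# Raw value reported for attribute 5 by both Fujitsu drives in this message.
RAW_VALUE = 8589934592000

SECTOR_BYTES = 512   # sector size assumed in the message
GIB = 2 ** 30        # the "GB" figure in the message works out in binary units

# Reading the raw value as a plain count of 512-byte sectors.
size_gib = RAW_VALUE * SECTOR_BYTES // GIB

# Splitting at bit 32: an assumption for illustration, not a documented layout.
high_part, low_part = divmod(RAW_VALUE, 2 ** 32)

print(size_gib)             # 4096000
print(high_part, low_part)  # 2000 0
```

If the low 32 bits were the actual reallocation count, it would be zero here, which would at least be consistent with the VALUE / WORST figures of 100. That is speculation, and it is one more reason to be wary of reading raw values across vendors.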
I was first going to reply to Bruce's message when I spotted yours, but you've dropped the last bit of his quote, that I was about to wholeheartedly agree with :)

: If it gets into the hundreds it's probably time to think about RMA'ing
: the drive - if you trust that the 'raw' column is reporting what you
: think it is (you should really only base your decision on the value,
: worst and threshold columns).

cheers, Ian
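To illustrate basing the decision on the value, worst and threshold columns, here is a rough Python sketch that parses one attribute line in the layout shown in this message and compares the normalised figures against THRESH. The names `SmartAttribute`, `parse_attribute`, `failing_now` and `failed_in_past` are hypothetical helpers made up for this example, and the column positions are assumed from the single line quoted above. The "at or below threshold" test follows the usual description of how smartctl flags a failed attribute; it is a sketch, not a replacement for smartctl's own health assessment.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalised current value
    worst: int   # lowest normalised value recorded
    thresh: int  # vendor failure threshold
    raw: str     # vendor-specific, kept as text on purpose


def parse_attribute(line: str) -> SmartAttribute:
    # Assumed columns, as in the line quoted in this message:
    # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    f = line.split(None, 9)
    return SmartAttribute(int(f[0]), f[1], int(f[3]), int(f[4]),
                          int(f[5]), f[9].strip())


def failing_now(attr: SmartAttribute) -> bool:
    # Current normalised value has dropped to or below the threshold.
    return attr.value <= attr.thresh


def failed_in_past(attr: SmartAttribute) -> bool:
    # Worst recorded normalised value was at or below the threshold.
    return attr.worst <= attr.thresh


LINE = ("5 Reallocated_Sector_Ct 0x0033 100 100 024 "
        "Pre-fail Always - 8589934592000")
attr = parse_attribute(LINE)
print(attr.name, failing_now(attr), failed_in_past(attr))
# Reallocated_Sector_Ct False False
```

For the Fujitsu line, VALUE and WORST are both 100 against a THRESH of 24, so neither check trips, despite the enormous raw number.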