Date:      Sat, 20 Aug 2011 12:17:26 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Alex Samorukov <ml@os2.kiev.ua>
Cc:        freebsd-stable@freebsd.org, Dan Langille <dan@langille.org>
Subject:   Re: bad sector in gmirror HDD
Message-ID:  <20110820191726.GA39027@icarus.home.lan>
In-Reply-To: <4E50003D.30803@os2.kiev.ua>
References:  <1B4FC0D8-60E6-49DA-BC52-688052C4DA51@langille.org> <20110819232125.GA4965@icarus.home.lan> <B6B0AD0F-A74C-4F2C-88B0-101443D7831A@langille.org> <20110820032438.GA21925@icarus.home.lan> <4774BC00-F32B-4BF4-A955-3728F885CAA1@langille.org> <4E4FF4D6.1090305@os2.kiev.ua> <20110820183456.GA38317@icarus.home.lan> <4E50003D.30803@os2.kiev.ua>

On Sat, Aug 20, 2011 at 08:43:09PM +0200, Alex Samorukov wrote:
> 
> >"The SMART tests you did didn't really amount to anything; no surprise.
> >short and long tests usually do not test the surface of the disk.  There
> >are some drives which do it on a long test, but as I said before,
> >everything varies from drive to drive."
> >
> It is not correct statement, sorry. Long test trying to read all the
> data from surface (and doing some other things).
>
> // one of the smartmontools developers and sysutils/smartmontools
> maintainer.

That's great, but too bad it's generally not true in practise.  Dan's
long scan on his site proves it, and I've dealt with this situation
myself many times over.

SMART long tests *may* do a surface scan, but in most cases they just
seem to do something that's similar to "short" but over a longer period
of time.  Furthermore, some which *do* do a surface scan on a "long"
test don't always report LBA failures in the self-test log.  I've
personally seen this happen on Western Digital disks (I don't have the
model strings; I'm certain I've rid myself of those drives).  Firmware
bug/quirk?  Possibly, but at the end of the day it doesn't matter -- it
means the end-user has wasted 2-3 hours for something that tests OK yet
we know for a fact isn't OK.

I *have* seen a drive do a surface scan on a "long" test and report LBAs
it couldn't read, but as I said, it's rare and varies from vendor to
vendor, drive to drive, and firmware to firmware.  When it happened I
was very, very surprised (and delighted).

The only thing I can trust 100% of the time when it comes to surface
scans is a SMART selective scan (if available; again, the OP's drive
does not offer one), or using dd or a read-per-LBA at the OS level
(which works everywhere).
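
For what it's worth, here is a rough sketch of what I mean by a
read-per-LBA at the OS level.  It is only an illustration, not a
finished tool: the function name scan_surface, its parameters, and the
512-byte sector size are my own assumptions (adjust for 4K-sector
drives), and /dev/ada0 in the usage note is just a placeholder device
name.  It reads the device in large blocks and, when a block fails,
retries it sector by sector to narrow down which offsets are unreadable.

```python
import os


def scan_surface(path, block_size=1024 * 1024, sector_size=512, max_bad=1000):
    """Read `path` from start to end.

    Returns (bytes_read, bad_offsets), where bad_offsets is a list of byte
    offsets of sectors that raised an I/O error.  Divide an offset by the
    sector size to get an LBA.  block_size should be a multiple of
    sector_size, since raw disk devices generally want aligned reads.
    """
    bad = []
    total = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while len(bad) < max_bad:  # give up if the device looks gone
            try:
                chunk = os.pread(fd, block_size, offset)
            except OSError:
                # The big read failed; walk the same range one sector at a
                # time so only the sectors that really fail get recorded.
                for sub in range(offset, offset + block_size, sector_size):
                    try:
                        total += len(os.pread(fd, sector_size, sub))
                    except OSError:
                        bad.append(sub)
                offset += block_size
                continue
            if not chunk:
                break  # end of device or file
            total += len(chunk)
            offset += len(chunk)
    finally:
        os.close(fd)
    return total, bad
```

Something like scan_surface("/dev/ada0"), run as root, would then return
the byte count read and the list of failing offsets.  A plain dd of the
device to /dev/null gets you much the same information from the kernel's
error messages; the sketch just collects the offsets in one place.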

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



