Date:      Thu, 5 Jul 2018 11:23:01 -0600
From:      Alan Somers <asomers@freebsd.org>
To:        Wojciech Puchar <wojtek@puchar.net>
Cc:        Stefan Blachmann <sblachmann@gmail.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>,  George Mitchell <george+freebsd@m5p.com>, Lev Serebryakov <lev@freebsd.org>
Subject:   Re: Confusing smartd messages
Message-ID:  <CAOtMX2goxJkv1CFAcoFsw0NxaYvmLDXV8CxWr2DuQ%2BD56w2vuw@mail.gmail.com>
In-Reply-To: <alpine.BSF.2.20.1807051859250.34332@puchar.net>
References:  <dfccd275-954c-11da-1790-e75878f89ad1@m5p.com> <51eb8232-49a7-0b3a-2d0f-9882ebfbfa1d@FreeBSD.org> <alpine.BSF.2.20.1807051642090.17082@puchar.net> <CACc-My36jbL=WWpxOB24D_YLDMofSHAk9JgrP86LKd4MEct1mg@mail.gmail.com> <CAOtMX2gG48jzWkPg3kGpSVDC89KY14ta3p-U%2BO5yExHZJfNL7w@mail.gmail.com> <alpine.BSF.2.20.1807051859250.34332@puchar.net>

On Thu, Jul 5, 2018 at 11:03 AM, Wojciech Puchar <wojtek@puchar.net> wrote:

>
>> Rewriting suspicious sectors is useless in this day and age.  HDDs and SSDs
>> already do it internally and have for years.  Even healthy sectors get
>>
>
> unreadable sectors cannot be rewritten by drive electronics as it doesn't
> know what to rewrite. it may possibly remap it but still report read error
> until some data will be written - unless giving no error and returning
> meaningless data is an accepted behaviour.
>

But if that disk is already managed by ZFS, the pool is redundant, and the
bad sector is allocated by ZFS, then ZFS will rewrite the unreadable sector
from a good copy as soon as a read of it fails.
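
A toy sketch of that self-healing read path, for illustration only.  This is
not ZFS code: `ToyMirror`, its fields, and the use of SHA-256 are assumptions
made up for the example.  It only shows the idea that a checksummed, redundant
store knows what to rewrite, which the drive's own firmware does not.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum stored apart from the data (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Hypothetical N-way mirror. Each 'disk' maps block number -> bytes.

    A value of None models an unreadable sector.
    """

    def __init__(self, ndisks: int = 2):
        self.disks = [dict() for _ in range(ndisks)]
        self.sums = {}  # block number -> expected checksum

    def write(self, blk: int, data: bytes) -> None:
        self.sums[blk] = checksum(data)
        for disk in self.disks:
            disk[blk] = data

    def read(self, blk: int) -> bytes:
        failed = []
        for disk in self.disks:
            data = disk.get(blk)
            if data is not None and checksum(data) == self.sums[blk]:
                # Good copy found: rewrite every copy that failed before it.
                for bad in failed:
                    bad[blk] = data
                return data
            failed.append(disk)
        raise IOError(f"block {blk}: no readable copy with a valid checksum")
```

Marking one copy unreadable (`m.disks[0][7] = None`) and then calling
`m.read(7)` returns the data from the other disk and leaves both copies intact
again.  Without a second copy the read raises, which is the non-redundant case.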


>
> only on write it can be done properly.
>
>> that the HDD/SSD won't fix itself would be a checksum error.  Those are
>>
>
> yes and this will happen if you powerdown your disk on write. or get some
> power spike or other source of noise that would affect electronic
> components.
>

It happens surprisingly rarely.  Even on a sudden power loss, the drive is
usually able to finish its current write operation.  Where you would run
into problems is if the power loss coincided with a mechanical shock that
knocked the head off-track, or something like that.


>
> performing full disk rewrite (so not zfs rebuilds) and THEN looking at
> smart stats and THEN performing regular smartctl -t long will tell the
> truth.
>
> which usually is "drive is fine" in my practice. really faulty drive will
> QUICKLY develop new problems.
>

Yeah, that should make the error go away.  It takes a long time, though.
With a SCSI drive, you can get the exact LBAs affected with a "READ
DEFECT DATA" command.  But there isn't a vendor-independent equivalent for
SATA, unfortunately.
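
On SATA, the nearest thing available through smartmontools is probably the
self-test log (`smartctl -l selftest`), which records only the first failing
LBA of each test, not a full defect list.  A minimal sketch of pulling those
LBAs out of the log text follows.  The sample log is invented for illustration,
and the column layout is assumed to match typical ATA output, so check it
against what your drive actually prints.

```python
import re

# Invented sample, shaped like ATA `smartctl -l selftest` output.
SAMPLE_LOG = """\
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     20123         123456789
# 2  Short offline       Completed without error       00%     20100         -
"""

# A log row starts with "# <n>" and ends with either an LBA or "-".
ROW = re.compile(r"^#\s*(\d+)\s+.*?\s(\d+|-)\s*$")


def first_error_lbas(log_text: str) -> list:
    """Return the LBA_of_first_error values from self-test rows that have one."""
    lbas = []
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match and match.group(2) != "-":
            lbas.append(int(match.group(2)))
    return lbas
```

Calling `first_error_lbas(SAMPLE_LOG)` yields the single LBA from the failed
extended test.  Since each test stops reporting at its first error, finding
further bad sectors this way means rewriting that LBA and running the long test
again.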

-Alan



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOtMX2goxJkv1CFAcoFsw0NxaYvmLDXV8CxWr2DuQ%2BD56w2vuw>