From owner-freebsd-hackers@freebsd.org  Thu Jul  5 16:39:49 2018
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8683B103BAA7
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu,  5 Jul 2018 16:39:49 +0000 (UTC)
 (envelope-from asomers@gmail.com)
Received: from mail-lj1-x229.google.com (mail-lj1-x229.google.com
 [IPv6:2a00:1450:4864:20::229])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id CACFC8737D;
 Thu,  5 Jul 2018 16:39:48 +0000 (UTC)
 (envelope-from asomers@gmail.com)
Received: by mail-lj1-x229.google.com with SMTP id 1-v6so7123847ljv.9;
 Thu, 05 Jul 2018 09:39:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:sender:in-reply-to:references:from:date:message-id
 :subject:to:cc;
 bh=uNF0ZV1vdtGn8VZs/c5Xp/ShdrKC5/kJoWnZ9H3capY=;
 b=ZManTTdUP1XtRKc5Zxik400QcV5DOml+7RY1TZMNjZTFwWbS0oDf1xns/TPCb5aKKV
 Auu9BN55zYwghwzcupCuOhbHA9yiuusPo5GBpNbQNpYLD2umk9sSLlghag72oMn0sGmk
 JkHY7NXm3KUntAxB35WYfDm8kHHrdwAUfd/xSAGxnghQU+8VX4F6EfXnw57KFEUvb+6+
 SrhahWb8XaY/HPRZwZo81F4M4xVhCMaSuKBx7QEP7qFJREk7U9zh641McQ2GWtIvvTo8
 Gv7NG6+rMy9jKlRQ6m+Yubf2TkJq5i/4BlahgGtPDFlo2ZRq7KztxS1NujAySzhAY8o6
 Tl4w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
 :date:message-id:subject:to:cc;
 bh=uNF0ZV1vdtGn8VZs/c5Xp/ShdrKC5/kJoWnZ9H3capY=;
 b=VJHBSQOWMTrMk5AUKziVaAwmdYwv0gFRzi+vrWtwHf1H33Us8V34Q6dwtV1RiNGdtr
 gpzSWcc9TZw1oYRQzGU0MdQACdvb10BETrV+/mwEg65qW/ZkOIuxx1N09nZ9DUjNXDD3
 QHKFAqG+Uic9PmltR7S70cqJAlVRSnfgKVkC+f/krtCMhSqLJPc0StuD4X4Ym13d9qEJ
 GPytMUZyvb7NoN2HMGotuBzW08BG1MYRlFQRt4JaD0rV7SvSFIX6YPVmZN3tbr1plZnG
 0qEYNw34m4KZmDQCdgb63kHj7cS6C7TOrFIN7G/lA8BkL3SsRh1WZizL3X74KS1YHak3
 j0Aw==
X-Gm-Message-State: APt69E1PSXbjUPQgEBLh5tEKth3qiWpqnOYtVSw3UhvzjIVGaZpGKROi
 UIgJAwJBJT3bcXvRUIfYAWCaSXYA0mSoW9pZ44w=
X-Google-Smtp-Source: AAOMgpcaRps6XYQIKOhbAxZfyJoZ2CvJ+z4PEY5MVsg1OZ7H1HpnemeyC/N5+yvEhAUQ1TUBZKQ60van7p9GsXvl+Zo=
X-Received: by 2002:a2e:3101:: with SMTP id x1-v6mr4628752ljx.8.1530808786267; 
 Thu, 05 Jul 2018 09:39:46 -0700 (PDT)
MIME-Version: 1.0
Sender: asomers@gmail.com
Received: by 2002:ab3:1b91:0:0:0:0:0 with HTTP;
 Thu, 5 Jul 2018 09:39:45 -0700 (PDT)
In-Reply-To: <CACc-My36jbL=WWpxOB24D_YLDMofSHAk9JgrP86LKd4MEct1mg@mail.gmail.com>
References: <dfccd275-954c-11da-1790-e75878f89ad1@m5p.com>
 <51eb8232-49a7-0b3a-2d0f-9882ebfbfa1d@FreeBSD.org>
 <alpine.BSF.2.20.1807051642090.17082@puchar.net>
 <CACc-My36jbL=WWpxOB24D_YLDMofSHAk9JgrP86LKd4MEct1mg@mail.gmail.com>
From: Alan Somers <asomers@freebsd.org>
Date: Thu, 5 Jul 2018 10:39:45 -0600
X-Google-Sender-Auth: 3mAMel_QA4gxlLQ0g7svzEkKZkc
Message-ID: <CAOtMX2gG48jzWkPg3kGpSVDC89KY14ta3p-U+O5yExHZJfNL7w@mail.gmail.com>
Subject: Re: Confusing smartd messages
To: Stefan Blachmann <sblachmann@gmail.com>
Cc: Wojciech Puchar <wojtek@puchar.net>,
 FreeBSD Hackers <freebsd-hackers@freebsd.org>, 
 George Mitchell <george+freebsd@m5p.com>, Lev Serebryakov <lev@freebsd.org>
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.27
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Jul 2018 16:39:49 -0000

My advice to the OP is to chill out.  SMART is inconsistently implemented
by different drive vendors and it's very hard to interpret its output.  I
would only recommend replacing a drive based on its SMART status for two
reasons:

1) The drive is under warranty and the vendor agrees to a free replacement
based on the SMART output alone.  The vendors know the meaning of their own
SMART fields better than you do.

2) A large statistical dataset shows that this particular SMART field is
correlated with early failure, for your model of hard drive (or at least a
similar model).  Backblaze maintains one such dataset, which they
periodically publish on their blog.  There are a few other outdated
datasets in the academic literature: one from AOL, and several from
supercomputer operators.  But Backblaze's is the best because a) it's
current, b) it's large, and c) they have a very diverse set of hard
drives.  Still, even Backblaze can sound a little superstitious (they
replace an entire chassis once several of its drives have had SMART
problems).

https://www.backblaze.com/blog/hard-drive-reliability-q1-2015/
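
To make point 2 a bit more concrete, here is a rough Python sketch of the
kind of check I mean: for one drive model, compare the failure rate of
drives where a given SMART attribute is nonzero against drives where it is
zero.  The records, the model name, and the failure_rates() helper are all
made up for illustration.  They are not Backblaze's data or anyone's real
tooling, and a real analysis would need far more drives than this.

```python
from collections import defaultdict

# Hypothetical records: (model, raw SMART attribute value, failed in window?)
# These values are invented for illustration only.
RECORDS = [
    ("model-a", 0, False),
    ("model-a", 0, False),
    ("model-a", 0, True),
    ("model-a", 0, False),
    ("model-a", 8, True),
    ("model-a", 16, True),
    ("model-a", 8, False),
    ("model-a", 24, True),
]


def failure_rates(records, model):
    """Return {attribute_is_nonzero: fraction_failed} for one drive model."""
    counts = defaultdict(lambda: [0, 0])  # nonzero? -> [failed, total]
    for rec_model, raw_value, failed in records:
        if rec_model != model:
            continue
        bucket = counts[raw_value > 0]
        bucket[0] += int(failed)
        bucket[1] += 1
    return {flag: failed / total for flag, (failed, total) in counts.items()}


rates = failure_rates(RECORDS, "model-a")
print(rates)
```

If the two rates are close, the attribute tells you little for that model.
Only a clear gap on a large sample would justify replacing a drive on that
field alone.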

If the drive is not RMAable and you're nervous because you love your data,
then you might consider setting up a hotspare.  zfsd(8) will activate it
the moment that one of your current drives fails.  You can even configure
the hotspare to be spun down most of the time so it won't be affected by
the mechanical shocks or regular wear that the live drives endure.

Rewriting suspicious sectors is useless in this day and age.  HDDs and SSDs
already do it internally and have for years.  Even healthy sectors get
rewritten every now and then due to the adjacent track interference
problem.  About the only kind of problem that could develop on the track
that the HDD/SSD won't fix itself would be a checksum error.  Those are
very rare, and ZFS will fix them immediately.
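
For anyone unfamiliar with the idea, here is a toy Python sketch of
checksum-based repair on a two-way mirror.  It is only an illustration of
the principle and assumes a redundant pool; the function names and the
in-memory "mirror" list are hypothetical and have nothing to do with ZFS's
actual on-disk format or code.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum kept separately from the data block it describes."""
    return hashlib.sha256(data).hexdigest()


def read_with_repair(copies, expected):
    """Return a copy matching the checksum and overwrite any bad copies."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise OSError("no copy matches the stored checksum")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # rewrite the damaged copy from the good one
    return good


block = b"important data"
stored = checksum(block)

# One side of the mirror has silently corrupted data.
mirror = [b"important dat\x00", block]
data = read_with_repair(mirror, stored)
```

Without a second copy there is nothing to repair from, so the read can only
report the error.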

-Alan "too well versed in hard drive reliability for my own good" Somers

On Thu, Jul 5, 2018 at 10:11 AM, Stefan Blachmann <sblachmann@gmail.com>
wrote:

> Another issue is that flash memories also exhibit the charge
> drain problem.
> They cannot be read indefinitely without occasional rewrite, as every
> read drains a minuscule amount of the charge.
>
> I often wished I knew of some OS/driver function/mechanism which can
> rewrite (or rather refresh) media on a mounted+running system and could
> be, for example, run via cron.
>
> Such would not only be very useful to fix pending sectors without
> stopping a running machine, but also for keeping embedded machines'
> flash memories reliably charged over the years.
>
>
>
> On 7/5/18, Wojciech Puchar <wojtek@puchar.net> wrote:
> >>> okay.  What's the recommended action at this point?     -- George
> >>
> >> In my experience it is the beginning of disk death, even if overall
> >> status is PASSED. It could work for a month or maybe half a year after
> >> the first Offline_Uncorrectable is detected (it depends on load), but
> >> your best bet is to replace it ASAP and throw it away.
> > well, my disk had this and lived happily for 3 years.
> >
> > It JUST means that some sectors are unreadable, which may be because
> > some write went wrong due to a hardware problem. But the cause may
> > be (and possibly was) a powerdown while writing, or a power spike.
> >
> > The media itself could be fine. The best action in such a case is to
> > force a rewrite of the whole drive with some data.
> >
> > With gmirror it is as easy as first checking the second drive for
> > errors, then forcing a remirror.
> > _______________________________________________
> > freebsd-hackers@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
> >