Date:      Thu, 29 Mar 2018 13:37:55 -0400
From:      Charles Sprickman <spork@bway.net>
To:        lev@FreeBSD.org
Cc:        Chuck Tuffli <chuck@tuffli.net>, Rainer Duffner <rainer@ultra-secure.de>, Tom Evans via freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: smart(8) Call for Testing
Message-ID:  <21F62A27-17F2-4791-BFD5-99057D197E68@bway.net>
In-Reply-To: <be4d85ef-1bd4-d666-42cb-41ad1bc67dd8@FreeBSD.org>
References:  <4754cb2f-76bb-a69b-0cf5-eff4d621eb29@callfortesting.org> <CAMXt9NbdN119RrHnZHOJD1T%2BHNLLpzgkKVStyTm=49dopBMoAQ@mail.gmail.com> <CAM0tzX1oTWTa0Nes11yXg5x4c30MmxdUyT6M1_c4-PWv2%2BQbhw@mail.gmail.com> <CAMXt9NYMrtTNqNSx256mcYsPo48xnsa%2BCCYSoeFLzRsc%2BfQWMw@mail.gmail.com> <CAM0tzX32v2-=saT5iB4WVcsoVOtH%2BXE0OQoP7hEDB1xE%2Bxk%2Bsg@mail.gmail.com> <1d3f2cef-4c37-782e-7938-e0a2eebc8842@quip.cz> <A548BC90-815C-4C66-8E27-9A6F7480741D@bway.net> <7ED27465-1BC2-4522-873E-9ECE192EB7A2@ultra-secure.de> <e54ab9a7-835d-16c7-1fdd-9f8285c0642b@FreeBSD.org> <CAM0tzX3RanY=vZbCXTAHB3=kv6aVkuzO5pmwr9g%2BZQoe%2BN1hVg@mail.gmail.com> <be4d85ef-1bd4-d666-42cb-41ad1bc67dd8@FreeBSD.org>


-- 
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet www.bway.net
spork@bway.net - 212.982.9800



> On Mar 29, 2018, at 9:43 AM, Lev Serebryakov <lev@freebsd.org> wrote:
>
> On 29.03.2018 16:27, Chuck Tuffli wrote:
>
>>>> Maybe one of the vendors who sells FreeBSD as part of an appliance
>>>> has shown some interest in this?
>>>>
>>>> If your hardware is well-defined and thus the output is consistent,
>>>> I could imagine it’s not too difficult to parse this.
>>> smartd is a very important part of smartmontools; smartctl less so.
>>>
>>> And periodic self-test triggering and checking is the most important
>>> feature of smartd, IMHO.
>>>
>>> Modern HDDs lie in their SMART data, and in my experience only regular
>>> self-tests uncover real errors on the surfaces.
>>>
>>> So a tool without support for HDD self-tests is of little use for
>>> appliances, IMHO.
>
>> Thank you for the feedback! As I don't have any experience with
>> smartd, can you help me better understand which parts of it are most
>> useful to you? Is it just periodically triggering the self test or
>> are there other features as well? For example, logging the SMART
>> values, emailing / triggering notifications when certain criteria are
>> met, monitoring the self tests, reading the error logs, etc.?
>
>  Triggering of short and full self-tests, and alerting (via e-mail) when
> a test fails.
>
>  Monitoring of values and alerting is VERY important (the number of
> relocations is the main indicator of spinning-HDD health, and when it rises
> it must be known ASAP), but it could be implemented with the simple smart(8)
> utility and some scripting, so this is not a problem.

It would be nice if we could grab these values via SNMP…  Right now I use
either NRPE or check_by_ssh in Nagios to run scripts that parse smartctl
output, and it would be weird to have SMART functions in base but not have
them tied to the stock snmpd.
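
For anyone who has not seen this kind of check script, here is a minimal
sketch in Python. It assumes the usual ATA attribute table layout printed by
`smartctl -A`; the function names and the warn/crit thresholds are made up
for illustration and are not taken from any existing plugin. It is not
written against smart(8) output.

```python
import re
import sys

# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches one row of the ATA attribute table printed by `smartctl -A`:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ATTR_ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)


def parse_attributes(text):
    """Return {attribute_name: leading integer of the raw value}."""
    attrs = {}
    for line in text.splitlines():
        m = ATTR_ROW.match(line)
        if m:
            attrs[m.group("name")] = int(m.group("raw"))
    return attrs


def check_reallocated(text, warn=1, crit=50):
    """Map Reallocated_Sector_Ct to a Nagios status.

    The warn/crit defaults are arbitrary illustrative values.
    """
    count = parse_attributes(text).get("Reallocated_Sector_Ct")
    if count is None:
        return UNKNOWN, "UNKNOWN - Reallocated_Sector_Ct not found"
    if count >= crit:
        return CRITICAL, f"CRITICAL - {count} reallocated sectors"
    if count >= warn:
        return WARNING, f"WARNING - {count} reallocated sectors"
    return OK, f"OK - {count} reallocated sectors"


def main():
    # Hypothetical entry point: reads `smartctl -A` output from stdin.
    status, message = check_reallocated(sys.stdin.read())
    print(message)
    sys.exit(status)
```

To use it as a plugin you would call main() from a `__main__` guard and pipe
the smartctl output into it over NRPE or check_by_ssh.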

> But all my dead HDDs were replaced after a self-test failure; that is
> what allowed me to replace them BEFORE data was lost.

Yep, lots of folks claim the data is useless, but generally I see some signs
of failure before the drive dies, and sometimes those signs are spotted
because smartd is triggering regular self-tests.  And on SSDs, watching the
MWI (Media Wearout Indicator) seems to work very well - these drives are much
smarter (no pun intended) than spinny disks.

Charles

>
> -- 
> // Lev Serebryakov



