From: Charles Sprickman
Subject: Re: smart(8) Call for Testing
Date: Thu, 29 Mar 2018 13:37:55 -0400
To: lev@FreeBSD.org
Cc: Chuck Tuffli, Rainer Duffner, Tom Evans via freebsd-fs
Message-Id: <21F62A27-17F2-4791-BFD5-99057D197E68@bway.net>
References: <4754cb2f-76bb-a69b-0cf5-eff4d621eb29@callfortesting.org> <1d3f2cef-4c37-782e-7938-e0a2eebc8842@quip.cz> <7ED27465-1BC2-4522-873E-9ECE192EB7A2@ultra-secure.de>

--
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet www.bway.net
spork@bway.net - 212.982.9800

> On Mar 29, 2018, at 9:43 AM, Lev Serebryakov wrote:
>
> On 29.03.2018 16:27, Chuck Tuffli wrote:
>
>>>> Maybe one of the vendors who sells FreeBSD as part of an appliance has shown some interest in this?
>>>>
>>>> If your hardware is well-defined and thus the output is consistent, I could imagine it's not too difficult to parse this.
>>> smartd is very important part of smartmontools, smartctl is not so.
>>>
>>> And periodic self-test triggering & check is most important feature of
>>> smartd, IMHO.
>>>
>>> Modern HDDs are liars in SMART. And only regular self-test discover
>>> real errors on surfaces in my experience.
>>>
>>> So, tool without support for HDD self-tests is of little usage for
>>> appliances, IMHO.
>
>> Thank you for the feedback! As I don't have any experience with
>> smartd, can you help me better understand which parts of it are most
>> useful to you? Is it just periodically triggering the self test or
>> are there other features as well? For example, logging the SMART
>> values, emailing / triggering notifications when certain criteria are
>> met, monitoring the self tests, reading the error logs, etc.?
>
> Triggering of short and full self-tests and alerting (via e-mail) when
> test failed.
>
> Monitoring of values and alerting is VERY important (number of
> Relocations is main indicator of spinning HDD health and when it raises
> it must be known ASAP), but it could be implemented with simple smart(8)
> utility and some scripting, so this is not problem.

It would be nice if we could grab these values via SNMP... Right now I use
either NRPE or check_by_ssh in Nagios to run scripts that parse smartctl
output, and it would be weird to have SMART functions in base but not have
them tied to the stock snmpd.

> But all my dead HDDs were replaced on self-test fail - it is what
> allows me to replace them BEFORE data were lost.

Yep, lots of folks claim the data is useless, but generally I see some signs
of failure before the drive dies, and sometimes those signs are spotted
because smartd is triggering regular self-tests. And on SSDs, watching the
MWI (Media Wearout Indicator) seems to work very well - these drives are much
smarter (no pun intended) than spinny disks.

Charles

>
> --
> // Lev Serebryakov
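
For illustration only, here is a minimal sketch of the kind of Nagios-style check script mentioned in this thread: it parses attribute-table text and maps the reallocated-sector count to a plugin exit status. It assumes the input looks like `smartctl -A` output for an ATA disk; the output format of smart(8) is not assumed here. The function names, the default thresholds (`warn=1`, `crit=50`), and the sample table are hypothetical, not taken from anyone's actual setup.

```python
import re

# Exit codes used by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_attributes(smartctl_output):
    """Return {attribute name: raw value} from `smartctl -A`-style text."""
    attrs = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() test.
        if len(fields) >= 10 and fields[0].isdigit():
            # Keep only the leading integer of the raw-value column.
            match = re.match(r"\d+", fields[9])
            if match:
                attrs[fields[1]] = int(match.group())
    return attrs


def check_reallocated(smartctl_output, warn=1, crit=50):
    """Return (exit_code, message); thresholds are illustrative assumptions."""
    count = parse_attributes(smartctl_output).get("Reallocated_Sector_Ct")
    if count is None:
        return UNKNOWN, "UNKNOWN: Reallocated_Sector_Ct not found"
    if count >= crit:
        return CRITICAL, f"CRITICAL: {count} reallocated sectors"
    if count >= warn:
        return WARNING, f"WARNING: {count} reallocated sectors"
    return OK, f"OK: {count} reallocated sectors"


# Made-up sample in the column layout of `smartctl -A` (not real drive data).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
"""
```

A wrapper run via NRPE or check_by_ssh would feed the real command output to `check_reallocated`, print the message, and exit with the returned code. Keeping the parsing separate from the threshold logic means the same check could be pointed at a different tool's output by swapping only `parse_attributes`.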