Date:      Wed, 13 Dec 2017 20:39:08 +0100
From:      "O. Hartmann" <o.hartmann@walstatt.org>
To:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>
Cc:        Cy Schubert <Cy.Schubert@komquats.com>, "O. Hartmann" <ohartmann@walstatt.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>, Freddie Cash <fjwcash@gmail.com>, Alan Somers <asomers@freebsd.org>
Subject:   Re: SMART: disk problems on RAIDZ1 pool: (ada6:ahcich6:0:0:0): CAMstatus: ATA Status Error
Message-ID:  <20171213203935.270e5f65@thor.intern.walstatt.dynvpn.de>
In-Reply-To: <201712131647.vBDGlrf2092528@pdx.rh.CN85.dnsmgr.net>
References:  <20171213161116.1889f178@hermann> <201712131647.vBDGlrf2092528@pdx.rh.CN85.dnsmgr.net>


On Wed, 13 Dec 2017 08:47:53 -0800 (PST),
"Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net> wrote:

> > On Tue, 12 Dec 2017 14:58:28 -0800
> > Cy Schubert <Cy.Schubert@komquats.com> wrote:
> >
> > > There are a couple of ways you can address this. You'll need to
> > > offline the vdev first. If you've run smartctl -t long and the
> > > test failed, smartctl -a will tell you which block it had an issue
> > > with. You can use dd, ddrescue or dd_rescue to dd the block over
> > > itself. The drive may rewrite the (weak) block or, if it fails to, it
> > > will remap it (subsequently showing as reallocated).
> > >
> > > Of course there is a risk. If the sector is any of the boot blocks
> > > there is a good chance the server will hang.
> >
> > The drive is part of a dedicated storage-only pool. The boot drive is a
> > fast SSD. So I do not care about this - well, to say it more politely:
> > I do not have to take care of that aspect.
> >
> > >
> > > You have to be *absolutely* sure which sector is the bad one. And there
> > > may be more. There is a risk of data loss.
> > >
> > > I've used this technique many times. Most times it works perfectly.
> > > Other times the affected file is lost but the rest of the file system
> > > is recovered. And again, there is always the risk.
> > >
> > > Replace the disk immediately if you experience a growing succession
> > > of pending sectors. Otherwise replace the disk at your earliest
> > > convenience.
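
For illustration, here is a minimal sketch of how the "dd the block over itself" step quoted above could be assembled. It only builds the dd argument list and never runs it. The function name rewrite_sector_cmd, the LBA value and the 512-byte sector size are assumptions for the example; the device /dev/ada6 is taken from the subject line, and the LBA is assumed to be counted in units of sector_size. Whether the drive then rewrites or remaps the sector is up to its firmware.

```python
def rewrite_sector_cmd(device, lba, sector_size=512):
    """Build a dd argv that reads one sector and writes it back in place.

    Hypothetical helper: it only returns the command, it does not run it.
    Assumes `lba` is counted in units of `sector_size`.
    """
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    if sector_size <= 0 or sector_size % 512 != 0:
        raise ValueError("sector size must be a positive multiple of 512")
    return [
        "dd",
        f"if={device}",       # read from the affected disk ...
        f"of={device}",       # ... and write back to the same disk
        f"bs={sector_size}",  # one block == one sector
        "count=1",            # touch exactly one sector
        f"skip={lba}",        # input offset, in blocks of bs
        f"seek={lba}",        # output offset, in blocks of bs
    ]


if __name__ == "__main__":
    # 123456789 is a made-up LBA; use the one smartctl -a actually reports.
    print(" ".join(rewrite_sector_cmd("/dev/ada6", 123456789)))
```

Printing the command instead of executing it leaves room to double-check the sector number first, in line with the warning above about being absolutely sure which sector is bad.
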
> >
> > The ZFS scrub of the pool finished this morning, leaving the pool in
> > a healthy state. After a reboot, there was no further sign of CAM errors.
> >
> > But there is something else I'm worried about. The mainboard I use is an
> > ASRock Z77 Pro4-M.
> > The board has a crippled Intel MCP with 6 SATA ports from the chipset,
> > two of them SATA 6 Gb/s and four SATA II, plus one additional chip with
> > two SATA 6 Gb/s ports:
> >
> > [...]
> > ahci0@pci0:2:0:0:       class=0x010601 card=0x06121849 chip=0x06121b21 rev=0x01 hdr=0x00
> >     vendor     = 'ASMedia Technology Inc.'
> >     device     = 'ASM1062 Serial ATA Controller'
> >     class      = mass storage
> >     subclass   = SATA
> >     bar   [10] = type I/O Port, range 32, base 0xe050, size 8, enabled
> >     bar   [14] = type I/O Port, range 32, base 0xe040, size 4, enabled
> >     bar   [18] = type I/O Port, range 32, base 0xe030, size 8, enabled
> >     bar   [1c] = type I/O Port, range 32, base 0xe020, size 4, enabled
> >     bar   [20] = type I/O Port, range 32, base 0xe000, size 32, enabled
> >     bar   [24] = type Memory, range 32, base 0xf7b00000, size 512, enabled
> > [...]
> >
> > Attached to that ASM1062 SATA chip, via an eSATA connector, is a backup
> > drive, a WD 4 TB RED. It seems that whenever I attach this drive
> > and it is online, I experience problems on the ZFS pool, which is
> > attached to the MCP SATA ports.
>
> How does this external drive get its power?  Are the earth grounds of
> both the system and the external drive power supply closely tied
> together?  A plug/unplug event with a slight ground creep can
> wreak havoc with device operation.

The external drive is housed in an external enclosure. Its PSU de facto has the same
"grounding" (earth ground) as the server's PSU: they share the same power plug at the
point where it comes out of the wall, so to speak.

>
> > Is this possible? I mean, as I asked before, weird/defective cabling
> > would trigger a different kind of error (CRC errors). Since the
> > external drive is physically decoupled and cannot couple vibrations
> > into the pool's drives, bad sector errors seem unlikely to me. But this
> > is simply a thought from someone without special knowledge about the
> > physics of HDDs.
>
> Even if left cabled, does this drive get powered up/down?

The drive is cabled (eSATA) all the time, but is switched off for long periods (4 - 8
weeks, i.e. up to 2 months, it depends). I switch it on for scrubbing or for backing up
important data.

>
> > I think people responding to my thread made it clear that the WD Green
> > isn't the first choice for a 20/6 (not 24/7) duty drive and, given
> > that they have now been in service for more than 25000 hours, it would
> > be wise to replace them with alternatives.
>
> I think someone had an apm command that turns off the head park,
> that would do wonders for drive life.   On the other hand, I think
> if it was my data and I saw that the drive had 2M head load cycles
> I would be looking to get out of that drive with any data I could
> not easily replace.  If it was well backed up or easily replaced
> my worries would be less.
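
For scale, a rough back-of-the-envelope calculation. It assumes the roughly 2M head load cycles and the 25000 power-on hours quoted above belong to the same drive, which may not hold exactly; load_cycle_rate is just a name made up for this sketch.

```python
def load_cycle_rate(load_cycles, power_on_hours):
    """Return (cycles per hour, seconds between cycles) as averages."""
    if load_cycles <= 0 or power_on_hours <= 0:
        raise ValueError("both counters must be positive")
    per_hour = load_cycles / power_on_hours
    seconds_between = 3600 / per_hour
    return per_hour, seconds_between


if __name__ == "__main__":
    # Approximate figures taken from the thread, not exact SMART readings.
    per_hour, gap = load_cycle_rate(2_000_000, 25_000)
    print(f"{per_hour:.0f} load cycles per hour, one every {gap:.0f} s")
```

Under those assumptions the heads would have parked about 80 times per hour on average, roughly once every 45 seconds of power-on time.
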
>
> ... 275 lines removed ...

As stated, I'm already prepared to change the drive(s), one by one.

Hopefully, ZFS will be as reliable for me as it has been for others ;-)

Kind regards,

Oliver


-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or for
market or opinion research (§ 28 Abs. 4 BDSG).



