Date: Thu, 14 Dec 2017 00:07:24 +0200
From: Daniel Kalchev <daniel@digsys.bg>
To: "O. Hartmann" <o.hartmann@walstatt.org>
Cc: "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, Cy Schubert <Cy.Schubert@komquats.com>, "O. Hartmann" <ohartmann@walstatt.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>, Freddie Cash <fjwcash@gmail.com>, Alan Somers <asomers@freebsd.org>
Subject: Re: SMART: disk problems on RAIDZ1 pool: (ada6:ahcich6:0:0:0): CAMstatus: ATA Status Error
Message-ID: <E18C1AE8-0450-4563-9093-5C84E937BD5C@digsys.bg>
In-Reply-To: <20171213203935.270e5f65@thor.intern.walstatt.dynvpn.de>
References: <20171213161116.1889f178@hermann> <201712131647.vBDGlrf2092528@pdx.rh.CN85.dnsmgr.net> <20171213203935.270e5f65@thor.intern.walstatt.dynvpn.de>

> On 13 Dec 2017, at 21:39, O. Hartmann <o.hartmann@walstatt.org> wrote:
>
> On Wed, 13 Dec 2017 08:47:53 -0800 (PST)
> "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net> wrote:
>
>>> On Tue, 12 Dec 2017 14:58:28 -0800
>>> Cy Schubert <Cy.Schubert@komquats.com> wrote:
>>>
>>>> There are a couple of ways you can address this. You'll need to
>>>> offline the vdev first. If you've done a smartctl -t long and if the
>>>> test failed, smartctl -a will tell you which block it had an issue
>>>> with. You can use dd, ddrescue or dd_rescue to dd the block over
>>>> itself. The drive may rewrite the (weak) block, or if it fails to, it
>>>> will remap it (subsequently showing as reallocated).
>>>>
>>>> Of course there is a risk. If the sector is any of the boot blocks
>>>> there is a good chance the server will hang.
>>>
>>> The drive is part of a dedicated storage-only pool. The boot drive is a
>>> fast SSD. So I do not care about this - well, to say it more politely:
>>> I do not have to take care of that aspect.
>>>
>>>> You have to be *absolutely* sure which sector is the bad one. And there
>>>> may be more. There is a risk of data loss.
>>>>
>>>> I've used this technique many times. Most times it works perfectly.
>>>> Other times the affected file is lost but the rest of the file system
>>>> is recovered. And again there is always the risk.
>>>>
>>>> Replace the disk immediately if you experience a growing succession
>>>> of pending sectors. Otherwise replace the disk at your earliest
>>>> convenience.
>>>
>>> The ZFS scrubbing of the volume ended this morning, leaving the pool in
>>> a healthy state. After reboot, there was no sign of CAM errors again.
>>>
>>> But there is something else I'm worried about. The mainboard I use is
>>> an ASRock Z77 Pro4-M.
>>> The board has a crippled Intel MCP with 6 SATA ports from the chipset,
>>> two of them SATA 6 Gb/s, 4 SATA II, and one additional chip with two
>>> SATA 6 Gb/s ports:
>>>
>>> [...]
>>> ahci0@pci0:2:0:0: class=0x010601 card=0x06121849 chip=0x06121b21 rev=0x01 hdr=0x00
>>>     vendor     = 'ASMedia Technology Inc.'
>>>     device     = 'ASM1062 Serial ATA Controller'
>>>     class      = mass storage
>>>     subclass   = SATA
>>>     bar   [10] = type I/O Port, range 32, base 0xe050, size 8, enabled
>>>     bar   [14] = type I/O Port, range 32, base 0xe040, size 4, enabled
>>>     bar   [18] = type I/O Port, range 32, base 0xe030, size 8, enabled
>>>     bar   [1c] = type I/O Port, range 32, base 0xe020, size 4, enabled
>>>     bar   [20] = type I/O Port, range 32, base 0xe000, size 32, enabled
>>>     bar   [24] = type Memory, range 32, base 0xf7b00000, size 512, enabled
>>> [...]
>>>
>>> Attached to that ASM1062 SATA chip is a backup drive via an eSATA
>>> connector, a WD 4 TB RED drive. It seems that whenever I attach this
>>> drive and it is online, I experience problems on the ZFS pool, which is
>>> attached to the MCP SATA ports.
>>
>> How does this external drive get its power? Are the earth grounds of
>> both the system and the external drive power supply closely tied
>> together? A plug/unplug event with a slight ground creep can
>> wreak havoc with device operation.
>
> The external drive is housed in an external casing. Its PSU has de facto
> the same "grounding" (earth ground) as the server's PSU: they share the
> same power plug at the point where the plug comes out of the wall, so to
> speak.

Most external drive power supplies are not grounded. At least none I have
ever seen had a grounded plug on the mains cable. Yours might have one...
Worth checking anyway.

Daniel
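
As an illustration of the "dd the block over itself" step quoted above, here is a minimal Python sketch that only builds the dd command line for a single sector; it does not execute anything. The function name dd_rewrite_command, the device path /dev/ada6 used in the example, and the 512-byte default sector size are assumptions made for illustration and are not taken from the thread. The sector number would be the failing LBA reported in the self-test log printed by smartctl -a, and the sector size must match the unit that LBA is counted in.

```python
import shlex


def dd_rewrite_command(device: str, lba: int, sector_size: int = 512) -> str:
    """Build (but do not run) a dd command that reads one sector from
    `device` and writes it back to the same offset on the same device.

    `device`, `lba` and `sector_size` are supplied by the caller; the
    defaults here are illustrative assumptions, not values from the thread.
    """
    if lba < 0:
        raise ValueError("lba must be non-negative")
    if sector_size <= 0:
        raise ValueError("sector_size must be positive")

    args = [
        "dd",
        f"if={device}",        # read from the disk ...
        f"of={device}",        # ... and write back to the same disk
        f"bs={sector_size}",   # one block == one sector
        "count=1",             # touch exactly one sector
        f"skip={lba}",         # input offset, counted in blocks of bs
        f"seek={lba}",         # output offset, counted in blocks of bs
    ]
    # Quote each argument so the result is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Hypothetical sector number, for illustration only.
    print(dd_rewrite_command("/dev/ada6", 123456))
```

Printing the command instead of running it leaves room to double-check the device and sector number first, in line with the warning above that one has to be absolutely sure which sector is the bad one. With no conv= options, dd stops if the read itself fails, so nothing is written in that case.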