Date:      Wed, 20 May 2015 08:29:25 +0200
From:      Simon Campese <freebsd_fs@campese.de>
To:        freebsd-fs@freebsd.org
Subject:   Re: hardware fault during ZFS send/receive blocks /dev/zfs indefinitely
Message-ID:  <867fs3danu.fsf@emacs.campese.org>
In-Reply-To: <86wq048x8h.fsf@emacs.campese.org>
References:  <86wq048x8h.fsf@emacs.campese.org>

Hello,

>Can you try using the geli and/or glabel command to force detach
>label/bkp101.eli so zfs treats it as a failure?  Also I'm not sure how
>geli and glabel will treat it but you could try sysctl
>kern.cam.ada.retry_count=0 to make the kernel give up on the disk
>quicker and the "failure" might cascade up to zfs where it should
>hopefully give up on the disk.  I think the problem here is ZFS does not
>know about the incomplete failures on the lower layers.

I've tried that already (forcefully closing the geli device and removing
the label) but it doesn't change anything. In fact, this happened
automatically as the drive stopped reacting after some time. So in the
end it had to be a reboot.
I can replicate the situation on a test machine with the same hardware
but, strangely enough, it turns out that the drive seems to be in
perfectly fine condition (and it should be: I used it as cold storage,
and although it is 5 years old, it was powered up fewer than 30 times,
for just a couple of hours).
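
For reference, the forced detach suggested in the quoted message
corresponds roughly to the sketch below. It is a dry run by default
(commands are only printed), the provider name is taken from the quote,
and the helper names run and force_detach are made up for illustration.

```sh
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set (run as root then).
PROV=${PROV:-label/bkp101}   # provider named in this thread; adjust as needed

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

force_detach() {
    run sysctl kern.cam.ada.retry_count=0   # let CAM give up on failing I/O sooner
    run geli detach -f "${PROV}.eli"        # force-detach the encrypted provider
    run glabel stop -f "${PROV#label/}"     # force-stop the label by its name
}

force_detach
```

As described above, doing this by hand did not unblock ZFS here, so the
sketch documents the attempt rather than a fix.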

I will open another thread for this but for the moment, let me
tell you the situation: I've created a label on the bare drive and put
a geli volume on it. On this geli volume, I can read and write with dd
just fine (tested data up to 10G), but if I put a zfs pool on it and
either write directly to it, or via send/receive, the drive stops
responding after some seconds (the "write" hdd light stays on
indefinitely) and I start getting the CAM errors.
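
The setup described above can be sketched as follows. This is a dry run
by default and only an approximation of the steps: the device name ada1,
the pool name testpool, the dd sizes, and the helper names are
assumptions, not values from the actual machine. Executed for real, it
destroys all data on the disk.

```sh
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set (run as root then).
# WARNING: with DRY_RUN=0 this overwrites the whole disk.
DISK=${DISK:-ada1}         # assumed device name
LABEL=${LABEL:-bkp101}     # label name used in this thread
POOL=${POOL:-testpool}     # assumed pool name

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

reproduce() {
    run glabel label "$LABEL" "/dev/$DISK"      # label on the bare drive
    run geli init "/dev/label/$LABEL"           # prompts for a passphrase
    run geli attach "/dev/label/$LABEL"
    # Raw write through geli: this part works fine (10G).
    run dd if=/dev/zero of="/dev/label/$LABEL.eli" bs=1m count=10240
    # Same provider under ZFS: the drive hangs after a few seconds.
    run zpool create "$POOL" "/dev/label/$LABEL.eli"
    run dd if=/dev/zero of="/$POOL/fill" bs=1m count=10240
}

reproduce
```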

To rule out a FreeBSD-specific issue, the disk is being tested in a Linux
machine right now (where it has been working before). It already passed
an extended SMART test without any errors and right now is in the middle
of a badblocks check (also without errors so far). If no errors are
found, I will put it back into my FreeBSD test machine and try to put a
plain UFS filesystem on it.
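
Roughly, the checks above and the planned UFS test look like the sketch
below (dry run by default). The Linux device name /dev/sdX, the mount
point /mnt, and the helper names are placeholders, and putting UFS on
the geli provider rather than on the bare disk is an assumption.

```sh
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set (run as root then).
LINUX_DEV=${LINUX_DEV:-/dev/sdX}             # placeholder Linux device name
UFS_DEV=${UFS_DEV:-/dev/label/bkp101.eli}    # assumed target on FreeBSD

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

linux_checks() {
    run smartctl -t long "$LINUX_DEV"        # start the extended self-test
    run smartctl -l selftest "$LINUX_DEV"    # read the result once it is done
    run badblocks -sv "$LINUX_DEV"           # read-only surface scan
}

ufs_check() {
    run newfs -U "$UFS_DEV"                  # UFS with soft updates
    run mount "$UFS_DEV" /mnt
    run dd if=/dev/zero of=/mnt/fill bs=1m count=10240
}

linux_checks
ufs_check
```

If the UFS write also hangs, ZFS itself is unlikely to be the cause; if
it completes, the problem is narrowed down to the ZFS write path on this
drive.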


Best,

Simon



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?867fs3danu.fsf>