Date:      Sat, 1 Feb 2025 09:57:15 +0100
From:      A FreeBSD User <freebsd@walstatt-de.de>
To:        Allan Jude <allanjude@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: ZFS: Rescue FAULTED Pool
Message-ID:  <20250201095656.1bdfbe5f@thor.sb211.local>
In-Reply-To: <980401eb-f8f6-44c7-8ee1-5ff0c9e1c35c@freebsd.org>
References:  <20250129112701.0c4a3236@freyja> <Z5oU1dLX4eQaN8Yq@albert.catwhisker.org> <20250130123354.2d767c7c@thor.sb211.local> <980401eb-f8f6-44c7-8ee1-5ff0c9e1c35c@freebsd.org>


On Thu, 30 Jan 2025 16:13:56 -0500
Allan Jude <allanjude@freebsd.org> wrote:

> On 1/30/2025 6:35 AM, A FreeBSD User wrote:
> > On Wed, 29 Jan 2025 03:45:25 -0800
> > David Wolfskill <david@catwhisker.org> wrote:
> >
> > Hello, thanks for responding.
> >
> >> On Wed, Jan 29, 2025 at 11:27:01AM +0100, FreeBSD User wrote:
> >>> Hello,
> >>>
> >>> a ZFS pool (RAIDZ1) has faulted. The pool is not importable
> >>> anymore, neither with import -F nor -f.
> >>> Although this pool is on an experimental system (no backup available),
> >>> it contains some data that would take a while to reconstruct, so I'd
> >>> like to ask whether there is a way to try to "de-fault" such a pool.
> >>
> >> Well, 'zpool clear ...' "Clears device errors in a pool." (from "man
> >> zpool").
> >>
> >> It is, however, not magic -- it doesn't actually fix anything.
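
(For reference, the usual shape of that command; BUNKER00 is the pool from
this thread, da1p1 a placeholder device:

 ~# zpool status BUNKER00        # inspect the per-device error counters first
 ~# zpool clear BUNKER00 da1p1   # clear the error counts of a single device
 ~# zpool clear BUNKER00         # or clear error counts pool-wide

As said, this only resets counters; a persistent fault comes right back.)
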
> >
> > For the record: I tried every method a common "administrator" can find
> > online, but hoped people are able to manipulate deeper stuff via zdb ...
> >
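
(A minimal sketch of that "deeper stuff": zdb can inspect a pool that will
not import. These are standard zdb flags; da1p1 stands for any member device:

 ~# zdb -l /dev/da1p1     # dump the four on-disk labels of one member device
 ~# zdb -e -C BUNKER00    # print the pool configuration found in the labels
 ~# zdb -e -u BUNKER00    # show the current uberblock of the exported pool
 ~# zdb -ul /dev/da1p1    # list all uberblocks, to find possible rewind txgs

zdb itself only reads; actual recovery still has to go through the rewind
options of 'zpool import'.)
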
> >>
> >> (I had an issue with a zpool which had a single SSD device as a ZIL; the
> >> ZIL device failed after it had accepted some data to be written to the
> >> pool, but before the data could be read and transferred to the spinning
> >> disks.  ZFS was quite unhappy about that.  I was eventually able to copy
> >> the data elsewhere, destroy the old zpool, recreate it *without* that
> >> single point of failure, then copy the data back.  And I learned to
> >> never create a zpool with a *single* device as a separate ZIL.)
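
(The corresponding fix, sketched for a hypothetical pool "tank" with two
spare devices da8 and da9: a log vdev added as a mirror has no single point
of failure, and log vdevs can be removed again later.

 ~# zpool add tank log mirror da8 da9   # attach a mirrored SLOG
 ~# zpool remove tank mirror-1          # detach it again, by the vdev name
                                        # that 'zpool status' reports
)
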
> >
> > Well, in this case I do not use dedicated ZIL drives. I have also had
> > several experiences with "single" ZIL drive setups, but a dedicated ZIL
> > is mostly useful in cases where you have a graveyard full of
> > inertia-suffering, mass-spinning HDDs - if I'm right, an SSD-based ZIL
> > would be of no use/effect on an all-SSD pool like this one. So I omitted
> > those.
> >
> >>
> >>> The pool is composed of 7 drives as a RAIDZ1. One of the SSDs
> >>> faulted, but I pulled the wrong one, so the pool ran into the
> >>> suspended state.
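
(For a pool that is merely SUSPENDED because a device vanished mid-flight,
the standard way out is to reattach the device and then retry the stalled
I/O with a clear:

 ~# zpool status BUNKER00   # pool state: SUSPENDED
 ~# zpool clear BUNKER00    # reissue the failed I/O and resume the pool

That window closed here, since the machine was rebooted with the pool still
suspended; now the pool has to survive an import first.)
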
> >>
> >> Can you put the drive you pulled back in?
> >
> > Every single SSD originally plugged in is now back in place, even the
> > faulted one (which doesn't report any faults at the moment).
> >
> > Although the pool isn't "importable", zdb reports its existence, alongside
> > zroot (which resides on a dedicated drive).
> >
> >>
> >>> The host is running the latest XigmaNAS BETA, which is effectively
> >>> FreeBSD 14.1-p2, just for the record.
> >>>
> >>> I do not want to give up, since I hoped there might be a rude but
> >>> effective way to restore the pool, even at the cost of some data loss ...
> >>>
> >>> Thanks in advance,
> >>>
> >>> Oliver
> >>> ....
> >>
> >> Good luck!
> >>
> >> Peace,
> >> david
> >
> >
> > Well, this is a hard and painful lesson to learn, if there is no chance
> > to get the pool back.
> >
> > A warning (though this may be useless in the realm of professionals): I
> > used a bunch of cheap spot-market SATA SSDs, a brand called "Intenso",
> > also common here in good old Germany. Some of those SSDs have a working
> > LED when used with a Fujitsu SAS HBA controller - but those died very
> > quickly, suffering from bus errors. Another bunch of those SSDs do not
> > have a working LED (no blinking on access), but they lasted a bit longer.
> > The problem with those SSDs is that I cannot easily locate the failing
> > device, e.g. by writing massive data to it via dd and watching for
> > activity.
> > I also ordered alternative SSDs from a more expensive brand - but bad
> > Karma ...
> >
> > Oliver
> >
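
(The dd approach mentioned above, sketched non-destructively; da1 is a
placeholder, and gstat is the stock FreeBSD I/O monitor:

 ~# dd if=/dev/da1 of=/dev/null bs=1m &   # put sustained read load on one disk
 ~# gstat -f 'da[0-9]+$'                  # watch per-device activity/latency

Without a working activity LED, a drive can instead be matched to its bay by
serial number, e.g. with 'diskinfo -v da1' or 'camcontrol identify da1'.)
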
>
> The most useful thing to share right now would be the output of `zpool
> import` (with no pool name) on the rebooted system.
>
> That will show where the issues are, and suggest how they might be solved.
>

Hello, this is exactly what happens when trying to import the pool. Prior to
the loss, device da1p1 had been FAULTED, with numbers in the "corrupted data"
column; those are no longer shown.


 ~# zpool import
   pool: BUNKER00
     id: XXXXXXXXXXXXXXXXXXXX
  state: FAULTED
status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

        BUNKER00    FAULTED  corrupted data
          raidz1-0  ONLINE
            da2p1   ONLINE
            da3p1   ONLINE
            da4p1   ONLINE
            da7p1   ONLINE
            da6p1   ONLINE
            da1p1   ONLINE
            da5p1   ONLINE


 ~# zpool import -f BUNKER00
cannot import 'BUNKER00': I/O error
        Destroy and re-create the pool from
        a backup source.


~# zpool import -F BUNKER00
cannot import 'BUNKER00': one or more devices is currently unavailable
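
(For completeness, the usual escalation path from here, all standard OpenZFS
import options; -n only simulates the rewind, and -X can make matters worse,
so it belongs last. The sysctl names are the FreeBSD spellings of the OpenZFS
spa_load_verify_* tunables and may vary slightly between versions:

 ~# zpool import -f -o readonly=on BUNKER00    # read-only import, no log replay
 ~# zpool import -F -n BUNKER00                # dry run: would a rewind help?
 ~# sysctl vfs.zfs.spa.load_verify_metadata=0  # relax metadata verification
 ~# sysctl vfs.zfs.spa.load_verify_data=0      # ... and data verification
 ~# zpool import -F -X BUNKER00                # extreme rewind, last resort

If any of these succeeds, the first step should be to 'zfs send' everything
to other storage, then destroy and re-create the pool.)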

-- 

A FreeBSD user
