Date:      Wed, 17 Sep 2014 00:34:33 +0200
From:      "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To:        "Steven Hartland" <killing@multiplay.co.uk>
Cc:        FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: zpool: multiple IDs, CURRENT drops all pools after reboot
Message-ID:  <20140917003433.47f4318b.ohartman@zedat.fu-berlin.de>
In-Reply-To: <27D574A82A374C2DAB8648ED7D2FDCD2@multiplay.co.uk>
References:  <20140916225346.10e0d4ae.ohartman@zedat.fu-berlin.de> <27D574A82A374C2DAB8648ED7D2FDCD2@multiplay.co.uk>

On Tue, 16 Sep 2014 22:06:36 +0100,
"Steven Hartland" <killing@multiplay.co.uk> wrote:

> > One of my backup drives dedicated to a ZPOOL is faulting and shows up
> > with multiple IDs.
> > The only working ID is id: 257822624560506537.
> >
> > FreeBSD CURRENT with three ZFS disks and only 4 GB of RAM is very "flaky"
> > regarding this issue: today, the whole pool set vanished twice after a
> > reboot. Giving the box 8 GB total and rebooting doesn't show the problem;
> > it gets more frequent when reducing the RAM to 4 GB (FreeBSD 11.0-CURRENT
> > #2 r271684: Tue Sep 16 20:41:47 CEST 2014). This is a bit spooky.
> >
> > Below is the faulted hard drive. I guess the drive/pool shown below
> > somehow triggers the loss of all other pools (I have to import the other
> > pools, which do not have any defects, but they drop out after a reboot
> > and vanish).
> >
> > Is there a way of getting rid of the faulty IDs without destroying the
> > pool?
> >
> > Regards,
> >
> > Oliver
> >
> >  root@thor: [/etc] zpool import
> >    pool: BACKUP00
> >      id: 9337833315545958689
> >   state: FAULTED
> >  status: One or more devices contains corrupted data.
> >  action: The pool cannot be imported due to damaged devices or data.
> >         The pool may be active on another system, but can be imported using
> >         the '-f' flag.
> >    see: http://illumos.org/msg/ZFS-8000-5E
> >  config:
> >
> >         BACKUP00               FAULTED  corrupted data
> >           8544670861382329237  UNAVAIL  corrupted data
> >
> >    pool: BACKUP00
> >      id: 257822624560506537
> >   state: ONLINE
> >  action: The pool can be imported using its name or numeric identifier.
> >  config:
> >
> >         BACKUP00    ONLINE
> >           ada3p1    ONLINE
> >
>
> Might be a long shot but check out the patches on:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594
>
> Specifically:
> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147070
>
> And if that doesn't work:
> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147286
>
> The second has all the changes from the first with the addition
> of some changes which dynamically size the max dirty data.
>
> These changes are in discussion and it's likely the additions
> in the second patch aren't the right direction but they
> have been reported to show good improvements under high
> memory pressure for certain workloads, so would be interesting
> to see if they help with your problem.
>
> All that said you shouldn't end up with corrupt data no matter
> what.
>
> Are there any other symptoms? Has memory been checked for
> faults etc?
>
>     Regards
>     Steve

The reason my desktop has only 4 GB left is that I discovered memory
corruption when it was equipped with 8 GB: a strange bit flip occurred. I
cannot be sure that removing 4 GB (two 2 GB modules; it is an old
C2D/P45-based box) has made the problem go away. I suspect a dying chipset:
when it overheats (at the moment the BIOS shows 80 degrees Celsius), the
problem is more frequent.

But, besides the data corruption, running with 4 GB and two disks put
together as a striped JBOD, plus another disk as the backup device (the
faulty one), is a pain in the ass, since the box starts swapping immediately
when any activity takes place on the ZFS drives. The plan is to keep that
graveyard alive for the next 2 months until I can afford a new system ;-)

But anyway, I'll try the patches.

Thanks,
Oliver




