Date: Wed, 17 Sep 2014 00:34:33 +0200
From: "O. Hartmann"
To: "Steven Hartland"
Cc: FreeBSD CURRENT
Subject: Re: zpool: multiple IDs, CURRENT drops all pools after reboot
Message-ID: <20140917003433.47f4318b.ohartman@zedat.fu-berlin.de>
In-Reply-To: <27D574A82A374C2DAB8648ED7D2FDCD2@multiplay.co.uk>
References: <20140916225346.10e0d4ae.ohartman@zedat.fu-berlin.de> <27D574A82A374C2DAB8648ED7D2FDCD2@multiplay.co.uk>
Organization: FU Berlin

On Tue, 16 Sep 2014 22:06:36 +0100, "Steven Hartland" wrote:

> > One of my backup drives dedicated to a ZPOOL is faulting and showing up with multiple IDs.
> > The only working ID is id: 257822624560506537.
> >
> > FreeBSD CURRENT with three ZFS disks and only 4GB of RAM is very "flaky" regarding
> > this issue: today, the whole pool set vanished two times after a reboot. Giving the
> > box 8 GB total and rebooting doesn't show the problem; it gets more frequent when
> > reducing the RAM to 4GB (FreeBSD 11.0-CURRENT #2 r271684: Tue Sep 16 20:41:47 CEST
> > 2014). This is a bit spooky.
> >
> > Below is the faulted hard drive. I guess the drive/pool shown below somehow triggers the
> > loss of all other pools (I have to import the other pools, which do not have any
> > defects, but they drop out after a reboot and vanish).
> >
> > Is there a way of getting rid of the faulty IDs without destroying the pool?
> >
> > Regards,
> >
> > Oliver
> >
> > root@thor: [/etc] zpool import
> >    pool: BACKUP00
> >      id: 9337833315545958689
> >   state: FAULTED
> >  status: One or more devices contains corrupted data.
> >  action: The pool cannot be imported due to damaged devices or data.
> >         The pool may be active on another system, but can be imported using
> >         the '-f' flag.
> >    see: http://illumos.org/msg/ZFS-8000-5E
> >  config:
> >
> >         BACKUP00               FAULTED  corrupted data
> >           8544670861382329237  UNAVAIL  corrupted data
> >
> >    pool: BACKUP00
> >      id: 257822624560506537
> >   state: ONLINE
> >  action: The pool can be imported using its name or numeric identifier.
> >  config:
> >
> >         BACKUP00    ONLINE
> >           ada3p1    ONLINE
> >
>
> Might be a long shot but check out the patches on:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594
>
> Specifically:
> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147070
>
> And if that doesn't work:
> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147286
>
> The second has all the changes from the first with the addition
> of some changes which dynamically size the max dirty data.
>
> These changes are in discussion and it's likely the additions
> in the second patch aren't the right direction but they
> have been reported to show good improvements under high
> memory pressure for certain workloads, so would be interesting
> to see if they help with your problem.
>
> All that said you shouldn't end up with corrupt data no matter
> what.
>
> Are there any other symptoms? Has memory been checked for
> faults etc?
>
> Regards
> Steve

The reason why my desktop has only 4 GB left is that I discovered memory corruption
when it was equipped with 8 GB: a strange bit flip occurred. I cannot be sure that
removing 4 GB (2 times 2GB, it is an old C2D/P45 based box) has made the problem go away.
I suspect a dying chipset: when it overheats (at the moment the BIOS shows 80 degrees
Celsius), the problem is more frequent.

But, besides data corruption, running with 4 GB left and 2 disks put together as a striped
JBOD, with another disk as the backup device (the faulty one), is a pain in the ass, since
the box starts swapping immediately when some action on the ZFS drives takes place. The
plan is to keep that graveyard alive for the next 2 months until I can afford a new
system ;-)

But anyway, I'll try the patches.

Thanks,

Oliver
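
For anyone following the thread: the `zpool import` listing quoted above shows two candidates that share the name BACKUP00 but have different numeric ids, and only one of them is ONLINE. Below is a minimal sketch, assuming the listing keeps the `pool:` / `id:` / `state:` layout shown in this mail, of how the importable id could be picked out programmatically. The function names `parse_import_listing` and `importable_ids` are hypothetical helpers written for illustration; they are not part of ZFS or of any FreeBSD tool, and the sample text is simply the output quoted in this message.

```python
import re

# Sample text copied from the `zpool import` listing quoted in this thread.
SAMPLE = """\
   pool: BACKUP00
     id: 9337833315545958689
  state: FAULTED
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-5E
 config:

        BACKUP00               FAULTED  corrupted data
          8544670861382329237  UNAVAIL  corrupted data

   pool: BACKUP00
     id: 257822624560506537
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        BACKUP00    ONLINE
          ada3p1    ONLINE
"""


def parse_import_listing(text):
    """Return one dict per candidate with its 'pool', 'id' and 'state' fields.

    Assumption: every candidate starts with a 'pool:' line, as in the sample.
    """
    candidates = []
    current = None
    for line in text.splitlines():
        # Only single-token "pool:", "id:" and "state:" lines are of interest;
        # "status:", "action:", "see:" and the config table do not match.
        match = re.match(r"\s*(pool|id|state):\s*(\S+)\s*$", line)
        if not match:
            continue
        key, value = match.groups()
        if key == "pool":
            current = {"pool": value}
            candidates.append(current)
        elif current is not None:
            current[key] = value
    return candidates


def importable_ids(text, name):
    """Return the ids of all candidates called `name` whose state is ONLINE."""
    return [
        c["id"]
        for c in parse_import_listing(text)
        if c["pool"] == name and c.get("state") == "ONLINE"
    ]


if __name__ == "__main__":
    print(importable_ids(SAMPLE, "BACKUP00"))
```

The id found this way is the one the listing's own "action" line says can be used to import the pool by numeric identifier. The sketch only selects among the candidates; it does nothing about the stale FAULTED entry itself.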