From owner-freebsd-fs@FreeBSD.ORG Fri Feb 21 00:52:03 2014
From: Mark Felder
To: freebsd-fs@freebsd.org
Subject: Re: Possible ZFS bug? Insufficient sanity checks
Date: Thu, 20 Feb 2014 18:52:00 -0600
Message-Id: <1392943920.2273.85942093.30069072@webmail.messagingengine.com>
List-Id: Filesystems

On Wed, Feb 19, 2014, at 8:47, Borja Marcos wrote:
>
> Hello,
>
> Doing something stupid, I managed to corrupt a ZFS pool. I think it
> shouldn't have been possible. I hope to reproduce it next week, but
> it's better to share just in case.
>
> I know what I did was quite foolish, and no dolphins were hurt, as it's
> just a test machine.
>
> FreeBSD pruebassd 10.0-STABLE FreeBSD 10.0-STABLE #8: Wed Feb 12 09:32:29
> UTC 2014 root@pruebassd:/usr/obj/usr/src/sys/PRUEBASSD2_10 amd64
>
> The pool has one RAIDZ vdev with 6 OCZ Vertex 4 SSDs.
>
> The stupid manoeuvre was as follows:
>
> 1) Pick one of the disks at random.
>
> 2) Extract it.
>
> So far so good. zpool warns that the pool is in a degraded state, but
> everything works.
>
> 3) Take the disk to a different system. Insert it and create a new pool
> on it. Just one disk; I was testing a data-corruption issue with an "mfi"
> adapter.
>
> 4) Do some tests.
>
> 5) Probably (not sure) destroy the newly created pool.
>
> 6) Take the SSD back to the original machine and insert it.
>
> And here comes the fun part.
>
> 7) zpool online cashopul (the previously removed disk)
>
> 8) KABOOM! zpool warns of data corruption all over the place; most
> files corrupted.
>
> My theory: when doing the "zpool online", ZFS just checked the disk
> serial number or identification and, it being the same, mixed the disk
> back into the pool *without verifying the pool identity*, with
> disastrous consequences.
>
> What I think should have happened instead:
>
> - ZFS should verify the physical disk "identity" *and* verify that the
> ZFS metadata on the disk indeed belongs to the pool on which it's being
> "onlined".
>
> Again, I do know that I did something very foolish (I behave in a foolish
> and careless way with that machine on purpose).
>
> I'll try to reproduce this next week (I'm waiting to receive some SAS
> cables).

I'm curious: did the zpools on both machines share the same name?
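The safeguard Borja argues for, refusing to online a disk whose on-disk ZFS label no longer matches the pool it is being onlined into, can be sketched as a toy model. This is illustrative Python, not ZFS code; the class and function names are assumptions. Real vdev labels do carry a pool name and pool GUID, which is what such a check would compare.

```python
# Toy model of the vdev-label identity check the post argues ZFS should
# perform before onlining a disk. Field names are illustrative; real ZFS
# stores this information in nvlist-format vdev labels on each disk.
from dataclasses import dataclass

@dataclass
class VdevLabel:
    pool_name: str
    pool_guid: int   # GUID of the pool this disk last belonged to
    txg: int         # transaction group when the label was last written

def can_online(pool_name: str, pool_guid: int, disk: VdevLabel) -> bool:
    """Refuse to online a disk whose label belongs to a different pool."""
    return disk.pool_guid == pool_guid and disk.pool_name == pool_name

# The disk was re-labelled in another machine: same physical device,
# but a different pool identity, so the online request should be refused.
stray = VdevLabel(pool_name="testpool", pool_guid=0x99AABBCCDDEEFF00, txg=412)
print(can_online("cashopul", 0x1122334455667788, stray))  # False
```

In this sketch the physical device is the same in both cases; only the label contents decide the outcome, which is the distinction the quoted post says was missing.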