From: Borja Marcos
To: "freebsd-fs@FreeBSD.org Filesystems"
Subject: Possible ZFS bug? Insufficient sanity checks
Date: Wed, 19 Feb 2014 15:47:35 +0100

Hello,

Doing something stupid I managed to corrupt a ZFS pool. I think it shouldn't have been possible. I hope to reproduce it next week, but it's better to share just in case.

I know what I did was quite foolish, and no dolphins were hurt as it's just a test machine.

FreeBSD pruebassd 10.0-STABLE FreeBSD 10.0-STABLE #8: Wed Feb 12 09:32:29 UTC 2014 root@pruebassd:/usr/obj/usr/src/sys/PRUEBASSD2_10 amd64

The pool has one RAIDZ vdev, with 6 OCZ Vertex 4 SSDs.

The stupid manoeuvre was as follows:

1) Pick one of the disks at random.
2) Extract it. So far so good. zpool warns that the pool is in a degraded state, but everything works.
3) Take the disk to a different system.
   Insert it and create a new pool on it. Just one disk; I was testing a data corruption issue with an "mfi" adapter.
4) Do some tests.
5) Probably (not sure) destroy the newly created pool.
6) Take the SSD back to the original machine and insert it.

And here the fun comes.

7) zpool online cashopul (the previously removed disk)
8) KABOOM! zpool warns of data corruption all over the place; most files are corrupted.

My theory: when doing the "zpool online", ZFS just checked the disk serial number or identification and, finding it the same, mixed the disk back into the pool *without verifying the pool identity*, with disastrous consequences.

What I think should have happened instead:

- ZFS should verify the physical disk "identity" *and* verify that the ZFS metadata on the disk indeed belongs to the pool on which it's being "onlined".

Again, I do know that I did something very foolish (I behave in a foolish and careless way with that machine on purpose).

I'll try to reproduce this next week (I'm waiting to receive some SAS cables).

Cheers,

Borja.
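
P.S. To make the check I have in mind concrete, here is a rough Python sketch of the comparison I would expect before a disk is accepted back into a pool. This is not ZFS code and does not describe what ZFS actually does: VdevLabel, ExpectedVdev and check_online are hypothetical names invented for illustration, and the sketch assumes the on-disk label carries a pool GUID and a vdev GUID that can be compared with what the pool remembers about the missing disk.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VdevLabel:
    """Hypothetical view of the identity fields read from a disk's label."""
    pool_guid: int
    vdev_guid: int


@dataclass(frozen=True)
class ExpectedVdev:
    """Hypothetical record of what the pool remembers about the missing disk."""
    pool_guid: int
    vdev_guid: int


def check_online(expected: ExpectedVdev,
                 label: Optional[VdevLabel]) -> Tuple[bool, str]:
    """Decide whether a disk may rejoin the pool, based only on its label.

    The physical identity of the disk (serial number, device path) is
    deliberately not considered: the same hardware can come back carrying
    another pool's metadata, which is the situation described above.
    """
    if label is None:
        return False, "no readable label on disk"
    if label.pool_guid != expected.pool_guid:
        return False, "label belongs to a different pool"
    if label.vdev_guid != expected.vdev_guid:
        return False, "label belongs to a different vdev"
    return True, "ok"


if __name__ == "__main__":
    remembered = ExpectedVdev(pool_guid=0x1111, vdev_guid=0x2222)
    # Same physical disk, but relabelled by a pool created on another machine.
    reused = VdevLabel(pool_guid=0x9999, vdev_guid=0x2222)
    print(check_online(remembered, reused))
```

With a check like this, the disk in step 7 would have been refused (or treated as a blank replacement to be resilvered from scratch) rather than trusted as an up-to-date member of the pool.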