From: "Arno J. Klaassen"
To: Martin Simmons
Date: Sun, 19 Feb 2012 17:54:50 +0100
References: <201202141820.q1EIK1MP032526@higson.cam.lispworks.com>
In-Reply-To: (Arno J. Klaassen's message of "Sat, 18 Feb 2012 18:55:17 +0100")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix)
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: 9-stable: one-device ZFS fails [was: 9-stable : geli + one-disk ZFS fails]

A follow-up to myself:

> Hello,
>
> Martin Simmons writes:
>
>> Some random ideas:
>>
>> 1) Can you dd the whole of ada0s3.eli without errors?
>>
>> 2) If you scrub a few more times, does it find the same number of errors each
>> time and are they always in that XNAT.tar file?
>>
>> 3) Can you try zfs without geli?
>
>
> yeah, and it seems to rule out geli :
>
> [ split the original /dev/ada0s3 into equally sized /dev/ada0s3 and
>   /dev/ada0s4 ]
>
>   geli init /dev/ada0s3
>   geli attach /dev/ada0s3
>
>   zpool create zgeli /dev/ada0s3.eli
>
>   zfs create zgeli/home
>   zfs create zgeli/home/arno
>   zfs create zgeli/home/arno/.priv
>   zfs create zgeli/home/arno/.scito
>   zfs set copies=2 zgeli/home/arno/.priv
>   zfs set atime=off zgeli
>
>
> [put some files on it, wait a little : ]
>
>
> [root@cc ~]# zpool status -v
>   pool: zgeli
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>   scan: scrub in progress since Sat Feb 18 17:46:54 2012
>         425M scanned out of 2.49G at 85.0M/s, 0h0m to go
>         0 repaired, 16.64% done
> config:
>
>         NAME          STATE     READ WRITE CKSUM
>         zgeli         ONLINE       0     0     1
>           ada0s3.eli  ONLINE       0     0     2
>
> errors: Permanent errors have been detected in the following files:
>
>         /zgeli/home/arno/8.0-CURRENT-200902-amd64-livefs.iso
> [root@cc ~]# zpool scrub -s zgeli
> [root@cc ~]#
>
>
> [then the same directly on the next partition ]
>
>   zpool create zgpart /dev/ada0s4
>
>   zfs create zgpart/home
>   zfs create zgpart/home/arno
>   zfs create zgpart/home/arno/.priv
>   zfs create zgpart/home/arno/.scito
>   zfs set copies=2 zgpart/home/arno/.priv
>   zfs set atime=off zgpart
>
> [put some files on it, wait a little : ]
>
>   pool: zgpart
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 0h0m with 1 errors on Sat Feb 18 18:04:45 2012
> config:
>
>         NAME      STATE     READ WRITE CKSUM
>         zgpart    ONLINE       0     0     1
>           ada0s4  ONLINE       0     0     2
>
> errors: Permanent errors have been detected in the following files:
>
>         /zgpart/home/arno/.scito/ ....
> [root@cc ~]#

I tested a bit more this afternoon :

 - zpool create zgpart /dev/ada0s4d => KO

 - split ada0s4 in two equally sized partitions and then
   zpool create zgpart mirror /dev/ada0s4d /dev/ada0s4e => works like a
   charm .....

(
[root@cc /zgpart]# zpool status -v zgpart
  pool: zgpart
 state: ONLINE
  scan: scrub repaired 0 in 0h36m with 0 errors on Sun Feb 19 17:20:34 2012
config:

        NAME         STATE     READ WRITE CKSUM
        zgpart       ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            ada0s4d  ONLINE       0     0     0
            ada0s4e  ONLINE       0     0     0

errors: No known data errors
)

FYI, best, Arno

>
> I still do not particularly suspect the disk since I cannot reproduce
> similar behaviour on UFS.
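
For reference, a UFS-side check of the kind alluded to above (write random data, copy it, compare) might look like the sketch below. The thread does not show the commands that were actually run, so the function name copy_and_compare, the file names and the default size are all assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch only: NOT the commands used in the thread, which are not shown.
# copy_and_compare DIR [SIZE_MB]
#   writes SIZE_MB of random data into DIR, copies it, and compares
#   source and copy byte for byte with cmp(1).
#   The function name, file names and default size are hypothetical.
copy_and_compare() {
    dir=$1
    size_mb=${2:-64}
    src="$dir/random.src"
    dst="$dir/random.dst"

    # Generate the test data from the kernel's random device.
    dd if=/dev/urandom of="$src" bs=1048576 count="$size_mb" 2>/dev/null || return 2

    # Copy the file, then ask the kernel to flush dirty buffers to disk.
    cp "$src" "$dst" || return 2
    sync

    # cmp -s prints nothing and exits non-zero if the files differ.
    if cmp -s "$src" "$dst"; then
        echo "OK"
    else
        echo "MISMATCH"
    fi

    # Clean up the test files.
    rm -f "$src" "$dst"
}
```

It would be called as, say, `copy_and_compare /usr/temp 1024`. Note that the comparison may well be served from the buffer cache rather than from the disk, so a test size well above the amount of RAM (or a remount between copy and compare) is probably needed before the result says much about the drive itself.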
>
> That said, this disk is supposed to be 'hybrid-SSD', maybe something
> special ZFS doesn't like ??? :
>
>
>  ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>  ada0: ATA-8 SATA 2.x device
>  ada0: Serial Number 5YX0J5YD
>  ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>  ada0: Command Queueing enabled
>  ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
>  ada0: Previously was known as ad4
>  GEOM: new disk ada0
>
>
> Please let me know what information to provide more.
>
> Best,
>
> Arno
>
>
>
>
>> 4) Is the slice/partition layout definitely correct?
>>
>> __Martin
>>
>>
>>>>>>> On Mon, 13 Feb 2012 23:39:06 +0100, Arno J Klaassen said:
>>>
>>> hello,
>>>
>>> to eventually gain interest in this issue :
>>>
>>> I updated to today's -stable, tested with vfs.zfs.debug=1
>>> and vfs.zfs.prefetch_disable=0, no difference.
>>>
>>> I also tested to read the raw partition :
>>>
>>>   [root@cc /usr/ports]# dd if=/dev/ada0s3 of=/dev/null bs=4096 conv=noerror
>>>   103746636+0 records in
>>>   103746636+0 records out
>>>   424946221056 bytes transferred in 13226.346738 secs (32128768 bytes/sec)
>>>   [root@cc /usr/ports]#
>>>
>>> Disk is brand new, looks ok, either my setup is not good or there is
>>> a bug somewhere; I can play around with this box for some more time,
>>> please feel free to provide me with some hints what to do to be useful
>>> for you.
>>>
>>> Best,
>>>
>>> Arno
>>>
>>>
>>> "Arno J. Klaassen" writes:
>>>
>>> > Hello,
>>> >
>>> >
>>> > I finally decided to 'play' a bit with ZFS on a notebook, some years
>>> > old, but I installed a brand new disk and memtest passes OK.
>>> >
>>> > I installed base+ports on partition 2, using 'classical' UFS.
>>> >
>>> > I crypted partition 3 and created a single zpool on it containing
>>> > 4 Z-"file-systems" :
>>> >
>>> >  [root@cc ~]# zfs list
>>> >  NAME                      USED  AVAIL  REFER  MOUNTPOINT
>>> >  zfiles                   10.7G   377G   152K  /zfiles
>>> >  zfiles/home              10.6G   377G   119M  /zfiles/home
>>> >  zfiles/home/arno         10.5G   377G  2.35G  /zfiles/home/arno
>>> >  zfiles/home/arno/.priv    192K   377G   192K  /zfiles/home/arno/.priv
>>> >  zfiles/home/arno/.scito  8.18G   377G  8.18G  /zfiles/home/arno/.scito
>>> >
>>> >
>>> > I export the ZFS's via nfs and rsynced on the other machine some backup
>>> > of my current note-book (geli + UFS, (almost) same 9-stable version, no
>>> > problem) to the ZFS's.
>>> >
>>> >
>>> > Quite fast, I see on the notebook :
>>> >
>>> >
>>> >  [root@cc /usr/temp]# zpool status -v
>>> >    pool: zfiles
>>> >   state: ONLINE
>>> >  status: One or more devices has experienced an error resulting in data
>>> >          corruption.  Applications may be affected.
>>> >  action: Restore the file in question if possible.  Otherwise restore the
>>> >          entire pool from backup.
>>> >     see: http://www.sun.com/msg/ZFS-8000-8A
>>> >    scan: scrub repaired 0 in 0h1m with 11 errors on Sat Feb 11 14:55:34 2012
>>> >  config:
>>> >
>>> >          NAME          STATE     READ WRITE CKSUM
>>> >          zfiles        ONLINE       0     0    11
>>> >            ada0s3.eli  ONLINE       0     0    23
>>> >
>>> >  errors: Permanent errors have been detected in the following files:
>>> >
>>> >          /zfiles/home/arno/.scito/contrib/XNAT.tar
>>> >  [root@cc /usr/temp]# md5 /zfiles/home/arno/.scito/contrib/XNAT.tar
>>> >  md5: /zfiles/home/arno/.scito/contrib/XNAT.tar: Input/output error
>>> >  [root@cc /usr/temp]#
>>> >
>>> >
>>> > As said, memtest is OK, nothing is logged to the console, UFS on the
>>> > same disk works OK (I did some tests copying and comparing random data)
>>> > and smartctl as well seems to trust the disk :
>>> >
>>> >  SMART Self-test log structure revision number 1
>>> >  Num  Test_Description    Status                   Remaining  LifeTime(hours)
>>> >  # 1  Extended offline    Completed without error        00%       388
>>> >  # 2  Short offline       Completed without error        00%       387
>>> >
>>> >
>>> > Am I doing something wrong and/or let me know what I could provide as
>>> > extra info to try to solve this (dmesg.boot at the end of this mail).
>>> >
>>> > Thanx a lot in advance,
>>> >
>>> > best, Arno
>>> >
>>> >
>>> >
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
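
On the question of whether repeated scrubs report the same number of errors each time: one way to make the runs easy to compare is to pull the CKSUM column out of `zpool status` output and keep it per scrub. A minimal sketch, assuming the config table layout seen in the outputs quoted in this thread; the helper name cksum_counts is hypothetical and not part of any ZFS tool.

```shell
#!/bin/sh
# Sketch only: cksum_counts is a hypothetical helper, not a ZFS command.
# It reads `zpool status` text on stdin and prints "NAME CKSUM" for every
# row of the config table (pool, vdevs and devices).
cksum_counts() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
        # A blank line or the "errors:" line ends the table.
        in_table && NF == 0           { in_table = 0 }
        in_table && $1 == "errors:"   { in_table = 0 }
        # Table rows: NAME STATE READ WRITE CKSUM
        in_table && NF >= 5           { print $1, $5 }
    '
}
```

Something like `zpool status -v zfiles | cksum_counts > scrub1.txt` after each completed scrub, followed by a plain diff of the saved files, would then show whether the counts move between runs.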