From owner-freebsd-fs@FreeBSD.ORG Mon Nov  8 19:11:30 2010
Message-ID: <4CD84B63.4030800@llnl.gov>
Date: Mon, 08 Nov 2010 11:11:31 -0800
From: Mike Carlson <carlson39@llnl.gov>
To: Jeremy Chadwick
Cc: freebsd-fs@freebsd.org, pjd@freebsd.org
Subject: Re: 8.1-RELEASE: ZFS data errors

On 11/08/2010 11:06 AM, Jeremy Chadwick wrote:
> On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
>> I'm having a problem with striping seven 18TB RAID6 (hardware SAN)
>> volumes together.
>>
>> Here is a quick rundown of the hardware:
>> * HP DL180 G6 w/ 12GB RAM
>> * QLogic FC HBA (QLogic ISP 2532 PCI FC-AL adapter)
>> * Winchester hardware SAN
>>
>> da2 at isp0 bus 0 scbus2 target 0 lun 0
>> da2: Fixed Direct Access SCSI-5 device
>> da2: 800.000MB/s transfers
>> da2: Command Queueing enabled
>> da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
>>
>> As soon as I create the striped volume and write data to it, the data
>> is reported as corrupted:
>>
>> write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
>> write# zpool scrub filevol001
>> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
>> write# cd /filevol001/
>> write# ls
>> random.dat.1
>> write# md5 *
>> MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
>> write# cp random.dat.1 random.dat.2
>> cp: random.dat.1: Input/output error
>> write# zpool status
>>   pool: filevol001
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         filevol001    ONLINE       0     0     0
>>           da2         ONLINE       0     0     0
>>           da3         ONLINE       0     0     0
>>           da4         ONLINE       0     0     0
>>           da5         ONLINE       0     0     0
>>           da6         ONLINE       0     0     0
>>           da7         ONLINE       0     0     0
>>           da8         ONLINE       0     0     0
>>
>> errors: No known data errors
>> write# zpool scrub filevol001
>> write# zpool status
>>   pool: filevol001
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>         corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>         entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8 10:14:20 2010
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         filevol001    ONLINE       0     0 2.38K
>>           da2         ONLINE       0     0 1.24K  12K repaired
>>           da3         ONLINE       0     0 1.12K
>>           da4         ONLINE       0     0 1.13K
>>           da5         ONLINE       0     0 1.27K
>>           da6         ONLINE       0     0     0
>>           da7         ONLINE       0     0     0
>>           da8         ONLINE       0     0     0
>>
>> errors: 2437 data errors, use '-v' for a list
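
(Aside, for anyone reproducing this: the individual damaged files can be
listed with the '-v' flag that last line suggests; nothing below is
specific to this setup beyond the pool name:

write# zpool status -v filevol001

That prints the same status block followed by a "Permanent errors have
been detected in the following files:" section naming each affected path,
and 'zpool clear filevol001' resets the error counters between test runs.)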
>>
>> However, if I create a 'raidz' volume, no errors occur:
>>
>> write# zpool destroy filevol001
>> write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
>> write# zpool status
>>   pool: filevol001
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         filevol001    ONLINE       0     0     0
>>           raidz1      ONLINE       0     0     0
>>             da2       ONLINE       0     0     0
>>             da3       ONLINE       0     0     0
>>             da4       ONLINE       0     0     0
>>             da5       ONLINE       0     0     0
>>             da6       ONLINE       0     0     0
>>             da7       ONLINE       0     0     0
>>             da8       ONLINE       0     0     0
>>
>> errors: No known data errors
>> write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
>> write# zpool scrub filevol001
>> write# zpool status
>>   pool: filevol001
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 09:54:51 2010
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         filevol001    ONLINE       0     0     0
>>           raidz1      ONLINE       0     0     0
>>             da2       ONLINE       0     0     0
>>             da3       ONLINE       0     0     0
>>             da4       ONLINE       0     0     0
>>             da5       ONLINE       0     0     0
>>             da6       ONLINE       0     0     0
>>             da7       ONLINE       0     0     0
>>             da8       ONLINE       0     0     0
>>
>> errors: No known data errors
>> write# ls
>> random.dat.1
>> write# cp random.dat.1 random.dat.2
>> write# cp random.dat.1 random.dat.3
>> write# cp random.dat.1 random.dat.4
>> write# cp random.dat.1 random.dat.5
>> write# cp random.dat.1 random.dat.6
>> write# cp random.dat.1 random.dat.7
>> write# md5 *
>> MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
>> MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
>> MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
>> MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
>> MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
>> MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
>> MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
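
(Another aside: one way to take ZFS out of the picture entirely is to
write a known pattern to the raw LUNs in parallel and read it back.  This
sketch is destructive and assumes the test pools have been destroyed
first; the device names are the ones from the probe messages above:

write# dd if=/dev/random of=/tmp/pattern bs=1m count=100
write# md5 /tmp/pattern
write# for d in da2 da3 da4 da5 da6 da7 da8; do dd if=/tmp/pattern of=/dev/$d bs=1m & done; wait
write# for d in da2 da3 da4 da5 da6 da7 da8; do dd if=/dev/$d bs=1m count=100 | md5; done

If the seven checksums printed by the second loop don't all match the
pattern's checksum, the corruption is happening below ZFS, i.e. in the
HBA, the isp(4) driver, or the SAN controller cache, rather than in ZFS's
striping code.)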
>>
>> What is also odd is that if I create 7 separate single-device ZFS
>> pools, they do not report any data corruption:
>>
>> write# zpool destroy filevol001
>> write# zpool create test01 da2
>> write# zpool create test02 da3
>> write# zpool create test03 da4
>> write# zpool create test04 da5
>> write# zpool create test05 da6
>> write# zpool create test06 da7
>> write# zpool create test07 da8
>> write# zpool status
>>   pool: test01
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test01      ONLINE       0     0     0
>>           da2       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test02
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test02      ONLINE       0     0     0
>>           da3       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test03
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test03      ONLINE       0     0     0
>>           da4       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test04
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test04      ONLINE       0     0     0
>>           da5       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test05
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test05      ONLINE       0     0     0
>>           da6       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test06
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test06      ONLINE       0     0     0
>>           da7       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test07
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test07      ONLINE       0     0     0
>>           da8       ONLINE       0     0     0
>>
>> errors: No known data errors
>> write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
>> write# cd /tmp/
>> write# md5 /tmp/random.dat.1
>> MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ; cp random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
>> write# md5 /test*/*
>> MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>> write# zpool scrub test01 ; zpool scrub test02 ; zpool scrub test03 ; zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; zpool scrub test07
>> write# zpool status
>>   pool: test01
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:49 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test01      ONLINE       0     0     0
>>           da2       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test02
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:52 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test02      ONLINE       0     0     0
>>           da3       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test03
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:54 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test03      ONLINE       0     0     0
>>           da4       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test04
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:57 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test04      ONLINE       0     0     0
>>           da5       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test05
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:28:00 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test05      ONLINE       0     0     0
>>           da6       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test06
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:28:02 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test06      ONLINE       0     0     0
>>           da7       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: test07
>>  state: ONLINE
>>  scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:28:05 2010
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         test07      ONLINE       0     0     0
>>           da8       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Based on these results, I've drawn the following conclusions:
>> * ZFS, single pool per device = OKAY
>> * ZFS raidz of all devices = OKAY
>> * ZFS stripe of all devices = NOT OKAY
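
(A last aside: the pattern in those conclusions, where each LUN on its own
is fine but a plain stripe of all seven is not, makes me wonder about
queue depth, since a stripe pushes concurrent I/O to every LUN at once.  A
cheap experiment, offered only as a guess, would be to shrink the
tagged-queue depth per device and re-run the dd/scrub test:

write# camcontrol tags da2 -v
write# camcontrol tags da2 -N 32

repeated for da3 through da8.  If the errors disappear, that points at the
FC target or the isp(4) driver mishandling deep queues rather than at ZFS.)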
>>
>> The results are immediate, and I know ZFS will self-heal, so is that
>> what it is doing behind my back and just not reporting it? Is this a
>> ZFS bug with striping vs. raidz?
> Can you reproduce this problem using RELENG_8?  Please try one of the
> below snapshots.
>
> ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
>
The server is in a data center with limited access control. Do I have the
option of checking out a particular CVS tag via csup and then doing a make
world/kernel? If so, I can report back later today; otherwise it might
take longer :(

Mike C
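
P.S. If csup is an acceptable route, I'm assuming the standard source
upgrade: a supfile along these lines (host, base, and prefix are just the
stock handbook values, and the supfile path below is arbitrary),

*default host=cvsup.FreeBSD.org
*default base=/var/db
*default prefix=/usr
*default release=cvs tag=RELENG_8
*default delete use-rel-suffix
*default compress
src-all

followed by the usual rebuild sequence:

write# csup /root/stable-supfile
write# cd /usr/src
write# make buildworld
write# make buildkernel KERNCONF=GENERIC
write# make installkernel KERNCONF=GENERIC
(reboot)
write# make installworld
write# mergemaster
write# shutdown -r now

Corrections welcome if a binary snapshot install is what's really needed.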