From: Mike Carlson <carlson39@llnl.gov>
Date: Mon, 08 Nov 2010 10:32:56 -0800
To: freebsd-fs@freebsd.org, pjd@freebsd.org
Subject: 8.1-RELEASE: ZFS data errors

I'm having a problem with striping seven 18TB RAID6 (hardware SAN) volumes
together. Here is a quick rundown of the hardware:

 * HP DL180 G6 w/ 12GB RAM
 * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
 * Winchester hardware SAN, presented as seven LUNs (da2 through da8):

da2 at isp0 bus 0 scbus2 target 0 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 800.000MB/s transfers
da2: Command Queueing enabled
da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)

As soon as I create the striped pool and write data to it, the data is
reported as corrupted:

write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
write# zpool scrub filevol001
write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
write# cd /filevol001/
write# ls
random.dat.1
write# md5 *
MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
write# cp random.dat.1 random.dat.2
cp: random.dat.1: Input/output error
write# zpool status
  pool: filevol001
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        filevol001    ONLINE       0     0     0
          da2         ONLINE       0     0     0
          da3         ONLINE       0     0     0
          da4         ONLINE       0     0     0
          da5         ONLINE       0     0     0
          da6         ONLINE       0     0     0
          da7         ONLINE       0     0     0
          da8         ONLINE       0     0     0

errors: No known data errors
write# zpool scrub filevol001
write# zpool status
  pool: filevol001
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8 10:14:20 2010
config:

        NAME          STATE     READ WRITE CKSUM
        filevol001    ONLINE       0     0 2.38K
          da2         ONLINE       0     0 1.24K  12K repaired
          da3         ONLINE       0     0 1.12K
          da4         ONLINE       0     0 1.13K
          da5         ONLINE       0     0 1.27K
          da6         ONLINE       0     0     0
          da7         ONLINE       0     0     0
          da8         ONLINE       0     0     0

errors: 2437 data errors, use '-v' for a list
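For what it's worth, the per-file damage list is available via the '-v'
flag that the status output mentions; I've left that list out here for
brevity:

write# zpool status -v filevol001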
However, if I create a 'raidz' pool instead, no errors occur:

write# zpool destroy filevol001
write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
write# zpool status
  pool: filevol001
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        filevol001    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da2       ONLINE       0     0     0
            da3       ONLINE       0     0     0
            da4       ONLINE       0     0     0
            da5       ONLINE       0     0     0
            da6       ONLINE       0     0     0
            da7       ONLINE       0     0     0
            da8       ONLINE       0     0     0

errors: No known data errors
write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
write# zpool scrub filevol001
write# zpool status
  pool: filevol001
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 09:54:51 2010
config:

        NAME          STATE     READ WRITE CKSUM
        filevol001    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da2       ONLINE       0     0     0
            da3       ONLINE       0     0     0
            da4       ONLINE       0     0     0
            da5       ONLINE       0     0     0
            da6       ONLINE       0     0     0
            da7       ONLINE       0     0     0
            da8       ONLINE       0     0     0

errors: No known data errors
write# ls
random.dat.1
write# cp random.dat.1 random.dat.2
write# cp random.dat.1 random.dat.3
write# cp random.dat.1 random.dat.4
write# cp random.dat.1 random.dat.5
write# cp random.dat.1 random.dat.6
write# cp random.dat.1 random.dat.7
write# md5 *
MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
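In case it helps narrow this down, watching per-device traffic while the
dd runs should show whether all seven LUNs are really being written evenly
in the striped layout (just a diagnostic sketch, not output from the runs
above):

write# zpool iostat -v filevol001 1

If one of the da devices stayed idle, or showed wildly uneven throughput,
that would point more at the HBA/SAN path than at ZFS itself.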
What is also odd is that if I create seven separate single-device pools,
they do not report any data corruption:

write# zpool destroy filevol001
write# zpool create test01 da2
write# zpool create test02 da3
write# zpool create test03 da4
write# zpool create test04 da5
write# zpool create test05 da6
write# zpool create test06 da7
write# zpool create test07 da8
write# zpool status
  pool: test01
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test01      ONLINE       0     0     0
          da2       ONLINE       0     0     0

errors: No known data errors

  pool: test02
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test02      ONLINE       0     0     0
          da3       ONLINE       0     0     0

errors: No known data errors

  pool: test03
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test03      ONLINE       0     0     0
          da4       ONLINE       0     0     0

errors: No known data errors

  pool: test04
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test04      ONLINE       0     0     0
          da5       ONLINE       0     0     0

errors: No known data errors

  pool: test05
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test05      ONLINE       0     0     0
          da6       ONLINE       0     0     0

errors: No known data errors

  pool: test06
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test06      ONLINE       0     0     0
          da7       ONLINE       0     0     0

errors: No known data errors

  pool: test07
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test07      ONLINE       0     0     0
          da8       ONLINE       0     0     0

errors: No known data errors
write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
write# cd /tmp/
write# md5 /tmp/random.dat.1
MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ; cp random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1 /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
write# md5 /test*/*
MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
write# zpool scrub test01 ; zpool scrub test02 ; zpool scrub test03 ; zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ; zpool scrub test07
write# zpool status
  pool: test01
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:49 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test01      ONLINE       0     0     0
          da2       ONLINE       0     0     0

errors: No known data errors

  pool: test02
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:52 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test02      ONLINE       0     0     0
          da3       ONLINE       0     0     0

errors: No known data errors

  pool: test03
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:54 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test03      ONLINE       0     0     0
          da4       ONLINE       0     0     0

errors: No known data errors

  pool: test04
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:27:57 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test04      ONLINE       0     0     0
          da5       ONLINE       0     0     0

errors: No known data errors

  pool: test05
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:28:00 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test05      ONLINE       0     0     0
          da6       ONLINE       0     0     0

errors: No known data errors

  pool: test06
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:28:02 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test06      ONLINE       0     0     0
          da7       ONLINE       0     0     0

errors: No known data errors

  pool: test07
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8 10:28:05 2010
config:

        NAME        STATE     READ WRITE CKSUM
        test07      ONLINE       0     0     0
          da8       ONLINE       0     0     0

errors: No known data errors

Based on these results, I've drawn the following conclusions:

 * ZFS, one pool per device = OKAY
 * ZFS raidz of all devices = OKAY
 * ZFS stripe of all devices = NOT OKAY

The errors appear immediately. I know ZFS will self-heal, so is that what
it is doing behind my back without reporting it? Or is this a ZFS bug that
affects striping but not raidz?
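If it would help isolate this, the same write/scrub cycle could be
repeated on two-device stripes to see whether a specific LUN, or simply
having more than one LUN in a stripe, triggers the corruption. A sketch
(the pool name 'pair' is arbitrary; I have not run this yet):

write# zpool create pair da2 da3
write# dd if=/dev/random of=/pair/random.dat bs=1m count=1000
write# zpool scrub pair
write# zpool status pair
write# zpool destroy pair

(then repeat with da4 da5, da6 da7, and so on)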