From: Jeremy Chadwick
To: Mike Carlson
Cc: "freebsd-fs@freebsd.org", "pjd@freebsd.org"
Date: Mon, 8 Nov 2010 11:29:50 -0800
Subject: Re: 8.1-RELEASE: ZFS data errors

On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote:
> On 11/08/2010 11:06 AM, Jeremy Chadwick wrote:
> >On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
> >>I'm having a problem with striping 7 18TB RAID6 (hardware SAN)
> >>volumes together.
> >>
> >>Here is a quick rundown of the hardware:
> >>* HP DL180 G6 w/12GB RAM
> >>* QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
> >>* Winchester Hardware SAN
> >>
> >>   da2 at isp0 bus 0 scbus2 target 0 lun 0
> >>   da2: Fixed Direct Access SCSI-5 device
> >>   da2: 800.000MB/s transfers
> >>   da2: Command Queueing enabled
> >>   da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
> >>
> >>As soon as I create the volume and write data to it, it is reported
> >>as being corrupted:
> >>
> >>   write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
> >>   write# zpool scrub filevol001
> >>   write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
> >>   1000+0 records in
> >>   1000+0 records out
> >>   1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
> >>   write# cd /filevol001/
> >>   write# ls
> >>   random.dat.1
> >>   write# md5 *
> >>   MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
> >>   write# cp random.dat.1 random.dat.2
> >>   cp: random.dat.1: Input/output error
> >>   write# zpool status
> >>     pool: filevol001
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME          STATE   READ WRITE CKSUM
> >>           filevol001    ONLINE     0     0     0
> >>             da2         ONLINE     0     0     0
> >>             da3         ONLINE     0     0     0
> >>             da4         ONLINE     0     0     0
> >>             da5         ONLINE     0     0     0
> >>             da6         ONLINE     0     0     0
> >>             da7         ONLINE     0     0     0
> >>             da8         ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>   write# zpool scrub filevol001
> >>   write# zpool status
> >>     pool: filevol001
> >>    state: ONLINE
> >>   status: One or more devices has experienced an error resulting in
> >>           data corruption.  Applications may be affected.
> >>   action: Restore the file in question if possible.  Otherwise restore
> >>           the entire pool from backup.
> >>      see: http://www.sun.com/msg/ZFS-8000-8A
> >>    scrub: scrub completed after 0h0m with 2437 errors on Mon Nov 8
> >>           10:14:20 2010
> >>   config:
> >>
> >>           NAME          STATE   READ WRITE CKSUM
> >>           filevol001    ONLINE     0     0 2.38K
> >>             da2         ONLINE     0     0 1.24K  12K repaired
> >>             da3         ONLINE     0     0 1.12K
> >>             da4         ONLINE     0     0 1.13K
> >>             da5         ONLINE     0     0 1.27K
> >>             da6         ONLINE     0     0     0
> >>             da7         ONLINE     0     0     0
> >>             da8         ONLINE     0     0     0
> >>
> >>   errors: 2437 data errors, use '-v' for a list
> >>
> >>However, if I create a 'raidz' volume, no errors occur:
> >>
> >>   write# zpool destroy filevol001
> >>   write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
> >>   write# zpool status
> >>     pool: filevol001
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME          STATE   READ WRITE CKSUM
> >>           filevol001    ONLINE     0     0     0
> >>             raidz1      ONLINE     0     0     0
> >>               da2       ONLINE     0     0     0
> >>               da3       ONLINE     0     0     0
> >>               da4       ONLINE     0     0     0
> >>               da5       ONLINE     0     0     0
> >>               da6       ONLINE     0     0     0
> >>               da7       ONLINE     0     0     0
> >>               da8       ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>   write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
> >>   1000+0 records in
> >>   1000+0 records out
> >>   1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
> >>   write# zpool scrub filevol001
> >>   write# zpool status
> >>     pool: filevol001
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           09:54:51 2010
> >>   config:
> >>
> >>           NAME          STATE   READ WRITE CKSUM
> >>           filevol001    ONLINE     0     0     0
> >>             raidz1      ONLINE     0     0     0
> >>               da2       ONLINE     0     0     0
> >>               da3       ONLINE     0     0     0
> >>               da4       ONLINE     0     0     0
> >>               da5       ONLINE     0     0     0
> >>               da6       ONLINE     0     0     0
> >>               da7       ONLINE     0     0     0
> >>               da8       ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>   write# ls
> >>   random.dat.1
> >>   write# cp random.dat.1 random.dat.2
> >>   write# cp random.dat.1 random.dat.3
> >>   write# cp random.dat.1 random.dat.4
> >>   write# cp random.dat.1 random.dat.5
> >>   write# cp random.dat.1 random.dat.6
> >>   write# cp random.dat.1 random.dat.7
> >>   write# md5 *
> >>   MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>   MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>   MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>   MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>   MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>   MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>   MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>
> >>What is also odd is that if I create 7 separate single-device ZFS
> >>pools, they do not report any data corruption:
> >>
> >>   write# zpool destroy filevol001
> >>   write# zpool create test01 da2
> >>   write# zpool create test02 da3
> >>   write# zpool create test03 da4
> >>   write# zpool create test04 da5
> >>   write# zpool create test05 da6
> >>   write# zpool create test06 da7
> >>   write# zpool create test07 da8
> >>   write# zpool status
> >>     pool: test01
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test01    ONLINE     0     0     0
> >>             da2     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test02
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test02    ONLINE     0     0     0
> >>             da3     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test03
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test03    ONLINE     0     0     0
> >>             da4     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test04
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test04    ONLINE     0     0     0
> >>             da5     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test05
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test05    ONLINE     0     0     0
> >>             da6     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test06
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test06    ONLINE     0     0     0
> >>             da7     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test07
> >>    state: ONLINE
> >>    scrub: none requested
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test07    ONLINE     0     0     0
> >>             da8     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>   write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
> >>   1000+0 records in
> >>   1000+0 records out
> >>   1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
> >>   write# cd /tmp/
> >>   write# md5 /tmp/random.dat.1
> >>   MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ; cp
> >>   random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1
> >>   /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
> >>   write# md5 /test*/*
> >>   MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>   write# zpool scrub test01 ; zpool scrub test02 ; zpool scrub test03
> >>   ; zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ;
> >>   zpool scrub test07
> >>   write# zpool status
> >>     pool: test01
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:27:49 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test01    ONLINE     0     0     0
> >>             da2     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test02
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:27:52 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test02    ONLINE     0     0     0
> >>             da3     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test03
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:27:54 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test03    ONLINE     0     0     0
> >>             da4     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test04
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:27:57 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test04    ONLINE     0     0     0
> >>             da5     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test05
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:28:00 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test05    ONLINE     0     0     0
> >>             da6     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test06
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:28:02 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test06    ONLINE     0     0     0
> >>             da7     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>     pool: test07
> >>    state: ONLINE
> >>    scrub: scrub completed after 0h0m with 0 errors on Mon Nov 8
> >>           10:28:05 2010
> >>   config:
> >>
> >>           NAME      STATE   READ WRITE CKSUM
> >>           test07    ONLINE     0     0     0
> >>             da8     ONLINE     0     0     0
> >>
> >>   errors: No known data errors
> >>
> >>Based on these results, I've drawn the following conclusions:
> >>* ZFS single pool per device = OKAY
> >>* ZFS raidz of all devices = OKAY
> >>* ZFS stripe of all devices = NOT OKAY
> >>
> >>The results are immediate, and I know ZFS will self-heal, so is that
> >>what it is doing behind my back and just not reporting it?  Is this
> >>a ZFS bug with striping vs. raidz?
>
> >Can you reproduce this problem using RELENG_8?  Please try one of the
> >below snapshots.
> >
> >ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
>
> The server is in a data center with limited access control.  Do I
> have the option of using a particular CVS tag (checking out via csup)
> and then performing a make world/kernel?

Doing this is more painful than, say, downloading a livefs image and
seeing if you can reproduce the problem there (that way you won't be
modifying your existing OS installation), especially since I can't
guarantee that the problem you're seeing is fixed in RELENG_8 (hence
my request to begin with).  But if you can't boot livefs, then here
you go:

You'll need some form of console access (either serial or VGA) to do
the upgrade reliably.  "Rolling back" may also not be an option, since
RELENG_8 is newer than RELENG_8_1 and may have introduced new binaries
into the fray.  If you don't have console access to this machine and
things go awry, you may be SOL.  The vagueness of my statement is
intentional; I can't cover every situation that might come to light.

Please be sure to back up your kernel configuration file before doing
the following, and make sure that the supfile shown below has
tag=RELENG_8 in it (it should).  And yes, the rm commands below are
recommended; failure to use them could result in some oddities, given
that your /usr/src tree refers to RELENG_8_1 version numbers, which
differ from RELENG_8.  You *do not* have to do this for ports (since
for ports, tag=. is used by default).
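A quick way to sanity-check the RELENG_8 tag before you sync (from
memory, so the exact *default lines may differ slightly on your
system):

  # Confirm the supfile tracks RELENG_8 before syncing.
  grep '^\*default' /usr/share/examples/cvsup/stable-supfile
  # Expect a line like: *default release=cvs tag=RELENG_8

Note that the -h flag on the csup command below overrides the
*default host entry in the supfile, so its placeholder value doesn't
matter.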
  rm -fr /var/db/sup/src-all
  rm -fr /usr/src/*
  rm -fr /usr/obj/*
  csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile

At this point you can restore your kernel configuration file to the
appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
world/kernel as per the instructions in /usr/src/Makefile (see lines
~51-62).
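From memory, the sequence in that Makefile boils down to roughly the
following; MYKERNEL is just a stand-in for your own kernel
configuration name, so treat the Makefile itself as authoritative:

  cd /usr/src
  make buildworld
  make buildkernel KERNCONF=MYKERNEL
  make installkernel KERNCONF=MYKERNEL
  shutdown -r now    # reboot, then come up in single user mode
  cd /usr/src
  mergemaster -p     # merge files installworld depends on
  make installworld
  mergemaster        # merge the rest of /etc
  shutdown -r now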
***Please do not skip any of the steps***.  Good luck.

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |