From: Barry Pederson <bp@barryp.org>
To: freebsd-fs@FreeBSD.org
Date: Fri, 13 Apr 2007 23:06:16 -0500
Subject: ZFS raidz device replacement problem

I've been playing with ZFS (awesome stuff, thanks PJD) and noticed
something funny when replacing a device in a raidz pool.

It seems that even though ZFS says resilvering is complete, you still
need to manually run a "zpool scrub" to really get the pool into a good
state.  From what I've read in the "Solaris ZFS Administration Guide",
that step shouldn't be required.  Is there some kind of auto-scrub
being missed?

I've tried to show this below with some md(4) devices: creating four of
them, putting three into a raidz pool, and then replacing one.  This is
with a world and kernel csupped and built earlier today (2007-04-13).
Note the five checksum errors on md3 that only show up after the manual
scrub at the end, even though the resilver reported completing with 0
errors.

-------------------
# truncate -s 128m /tmp/foo0
# truncate -s 128m /tmp/foo1
# truncate -s 128m /tmp/foo2
# truncate -s 128m /tmp/foo3
# mdconfig -a -t vnode -f /tmp/foo0
md0
# mdconfig -a -t vnode -f /tmp/foo1
md1
# mdconfig -a -t vnode -f /tmp/foo2
md2
# mdconfig -a -t vnode -f /tmp/foo3
md3
# zpool create mypool raidz md0 md1 md2
# zpool status mypool
  pool: mypool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            md0     ONLINE       0     0     0
            md1     ONLINE       0     0     0
            md2     ONLINE       0     0     0

errors: No known data errors
# zpool replace mypool md2 md3
# zpool status mypool
  pool: mypool
 state: ONLINE
 scrub: resilver completed with 0 errors on Fri Apr 13 22:43:19 2007
config:

        NAME           STATE     READ WRITE CKSUM
        mypool         ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            md0        ONLINE       0     0     0
            md1        ONLINE       0     0     0
            replacing  ONLINE       0     0     0
              md2      ONLINE       0     0     0
              md3      ONLINE       0     0     0

errors: No known data errors
# zpool status mypool
  pool: mypool
 state: ONLINE
 scrub: resilver completed with 0 errors on Fri Apr 13 22:43:19 2007
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            md0     ONLINE       0     0     0
            md1     ONLINE       0     0     0
            md3     ONLINE       0     0     0

errors: No known data errors
# zpool scrub mypool
# zpool status mypool
  pool: mypool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are
        unaffected.
action: Determine if the device needs to be replaced, and clear the
        errors using 'zpool clear' or replace the device with 'zpool
        replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Fri Apr 13 22:43:46 2007
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            md0     ONLINE       0     0     0
            md1     ONLINE       0     0     0
            md3     ONLINE       0     0     5

errors: No known data errors
--------------------------------

	Barry
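P.S. If anyone wants to reproduce this and then clean up afterwards,
something like the following should tear the test setup back down.  A
rough sketch, I'm typing it from memory rather than pasting a
transcript, and the unit numbers assume the md0 through md3 devices
allocated above:

-------------------
# zpool destroy mypool
# mdconfig -d -u 0
# mdconfig -d -u 1
# mdconfig -d -u 2
# mdconfig -d -u 3
# rm /tmp/foo0 /tmp/foo1 /tmp/foo2 /tmp/foo3
-------------------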