From owner-freebsd-questions@FreeBSD.ORG Mon Jun 13 16:22:47 2011
Message-ID: <4DF63314.3000807@thingy.com>
Date: Mon, 13 Jun 2011 16:56:04 +0100
From: Howard Jones
To: freebsd-questions@freebsd.org
Subject: ZFS on 8.1 - various problems after a disk failure.

I have a FreeBSD 8.2 server at home with four 2TB drives in it, running
ZFS with a raidz pool. Some time ago, I had a disk fail. Initially it
wasn't totally obvious the disk had failed, so I ran a 'zpool scrub' on
the pool, which threw up a lot of errors and also produced a lot of
sense errors, making it obvious I had a dead disk.

I replaced the disk, then ran "zpool replace zjumbo ad4 ad4" to replace
the bad disk in-place and start a resilver.

Now I have a few problems:

1) The old ad4 is still listed, even after several scrub/resilvers.
   Shouldn't it go away?
2) Although I lost a whole directory with ~1TB of music, the space
   allocated to that directory is still in use according to df.

3) I have another bunch of files that appear in directory listings, but
   I get "Illegal byte sequence" errors when trying to read them (with
   anything - du, file, wc).

I have backups of most of the stuff on the pool (although it'd be nice
to recover the more recent data), but how do I get out of this
situation without nuking the site from orbit (my current plan)?

Firstly, I'd like to get a reliable representation of what's actually
on the filesystem and, for bonus points, to get back some of the data
that should be intact (only one disk in the set was actually bad,
right?).

Here's my current zpool status. Thanks in advance for any pointers!

Howie

# zpool status
  pool: zjumbo
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 10h57m with 15190 errors on Thu May 19 09:26:59 2011
config:

        NAME           STATE     READ WRITE CKSUM
        zjumbo         DEGRADED     0     0  199K
          raidz1       DEGRADED     0     0  792K
            replacing  DEGRADED     0     0     0
              ad4/old  UNAVAIL      0 16.1M     0  cannot open
              ad4      ONLINE       0     0     0  1.15T resilvered
            ad6        ONLINE       0     0     0  677M resilvered
            ad8        ONLINE       0     0     0  660M resilvered
            ad10       ONLINE       0     0     0  535M resilvered

errors: 15190 data errors, use '-v' for a list