From: Jason Edwards <sub.mesa@gmail.com>
To: freebsd-fs@freebsd.org
Date: Sun, 9 Aug 2009 13:14:32 +0200
Message-ID: <883b2dc50908090414o71bc5fc2q5aef64c2b5da653e@mail.gmail.com>
Subject: ZFS corruption on 8-CURRENT

Hi guys,

I'm investigating some weird
corruption issue. After filling up my 8-disk RAID-Z pool with data and using it for a few weeks, it started to show me this:

# zpool status sub
  pool: sub
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        sub         UNAVAIL      0     0     0  insufficient replicas
          raidz1    UNAVAIL      0     0     0  insufficient replicas
            ad14a   FAULTED      0     0     0  corrupted data
            ad8a    ONLINE       0     0     0
            ad10a   ONLINE       0     0     0
            ad10a   FAULTED      0     0     0  corrupted data
            ad18a   FAULTED      0     0     0  corrupted data
            ad12a   FAULTED      0     0     0  corrupted data
            ad16a   FAULTED      0     0     0  corrupted data
            ad8a    FAULTED      0     0     0  corrupted data

Oops? What happened here? Besides the "corrupted data", note that ad10a is displayed twice, once ONLINE and once FAULTED (the same goes for ad8a). After rebooting, the output looks a little cleaner, but now it reports a problem with the ZIL:

# zpool status sub
  pool: sub
 state: FAULTED
status: An intent log record could not be read.
        Waiting for administrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
        or ignore the intent log records by running 'zpool clear'.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        sub         FAULTED      0     0     0  bad intent log
          raidz1    ONLINE       0     0     0
            ad14a   ONLINE       0     0     0
            ad4a    ONLINE       0     0     0
            ad6a    ONLINE       0     0     0
            ad10a   ONLINE       0     0     0
            ad18a   ONLINE       6     0     0
            ad12a   ONLINE       0     0     0
            ad16a   ONLINE       0     0     0
            ad8a    ONLINE       0     0     0

Additionally, I got some read errors on ad18, but since this is a RAID-Z, I assume one disk alone cannot corrupt or fail the entire array. Before I take any action that might be destructive: does anybody have a clue what happened here, and how I can prevent it in the future?

The box is a quad-core X4 9350e with 6GB RAM, running 8-CURRENT as of July 21st, 2009 (after 8.0-BETA2).
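If I read the "action:" text correctly, the least-destructive recovery attempt would be something like the sketch below. To be clear, this is just my reading of the suggested procedure, not something I have run yet; the pool name "sub" and device names are mine, and I'd rather understand the root cause before discarding intent log records:

```shell
# Sketch of the recovery steps suggested by the fault message (untested).
# Step 1: if a device really is missing/offline, bring it back first:
zpool online sub ad18a

# Step 2: re-check the pool state after the device returns:
zpool status sub

# Step 3: only if the intent log records are expendable, discard them
# (this clears the "bad intent log" error, losing any unreplayed writes):
zpool clear sub

# Step 4: scrub to verify checksums across the whole pool afterwards:
zpool scrub sub
```

Does that ordering look right, or is 'zpool clear' here likely to make things worse?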
It did work correctly before I upgraded CURRENT to a newer date, so maybe some bug slipped in?

Kind regards,
sub