From owner-freebsd-current@FreeBSD.ORG Fri Oct 12 02:38:18 2007
Message-ID: <470EDE0E.8070800@acm.poly.edu>
Date: Thu, 11 Oct 2007 22:38:06 -0400
From: Boris Kochergin <spawk@acm.poly.edu>
To: freebsd-current@freebsd.org
Subject: ZFS raidz1 redundancy
List-Id: Discussions about the use of FreeBSD-current

Hi. I'm running an i386 -CURRENT built on October 2nd. I have a raidz1 pool consisting of seven 400-GiB PATA disks. ad4 and ad5 are part of the pool.
This afternoon, the following happened:

Oct 11 19:05:27 exodus kernel: ad4: timeout waiting to issue command
Oct 11 19:05:27 exodus kernel: ad4: error issuing READ_DMA command
Oct 11 19:05:27 exodus root: ZFS: vdev I/O failure, zpool=home path=/dev/ad4 offset=70362711040 size=21504 error=5

The machine proceeded to panic after that, and when it rebooted, the following happened after a while:

Oct 11 19:11:40 exodus kernel: ad5: detached
Oct 11 19:11:40 exodus kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=32

It crashed again half an hour after that, and when it came back up, ad4 was no longer detected by the ATA controller. The output of "zpool status" is as follows:

  pool: home
 state: FAULTED
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        home        FAULTED      6     0     0  corrupted data
          raidz1    DEGRADED     6     0     0
            ad4     UNAVAIL      0     0     0  cannot open
            ad5     ONLINE       0     0     0
            ad10    ONLINE       0     0     0
            ad11    ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad9     ONLINE       0     0     0
            ad6     ONLINE       0     0     0

Is it possible that the data on ad5, in the midst of the failure of ad4, has become inconsistent with the other members of the pool, and that I need to bring ad4 online (I'm fairly sure that it's a motherboard- or power-related issue and that the drive is OK) to be able to access the data on the pool?

-Boris
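[Archive note: for readers landing on this thread, below is a minimal sketch of the recovery sequence that the "action" line in the status output points at. It assumes the hardware problem is fixed and ad4 reappears under the same device node, /dev/ad4; device names and the pool name "home" come from the post above.]

```shell
# Assumption: ad4 is physically back and detected by the ATA controller again.
# Bring the missing vdev back into the pool:
zpool online home ad4

# ZFS resilvers ad4 from the surviving raidz1 members; watch progress with:
zpool status home

# Once the resilver completes, verify checksums across the whole pool:
zpool scrub home
```

Because raidz1 tolerates only a single missing member, the pool stays FAULTED while ad4 is absent; onlining it (rather than replacing the disk) is the right first step if, as the poster suspects, the drive itself is healthy.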