Date: Mon, 7 Jan 2008 10:44:13 +0800
From: "Tz-Huan Huang"
To: "Brooks Davis"
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS i/o errors - which disk is the problem?

2008/1/4, Brooks Davis:
>
> We've definitely seen cases where hardware changes fixed ZFS checksum errors.
> In one case, a firmware upgrade on the raid controller fixed it. In another
> case, we'd been connecting to an external array with a SCSI card that didn't
> have a PCI bracket and the errors went away when the replacement one arrived
> and was installed. The fact that there were significant errors caught by ZFS
> was quite disturbing since we wouldn't have found them with UFS.

Hi,

We have an NFS server using ZFS with a similar problem. The box is i386
7.0-PRERELEASE with 3 GB of RAM:

# uname -a
FreeBSD cml3 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #2: Sat Jan 5 14:42:41 CST 2008 root@cml3:/usr/obj/usr/src/sys/CML2 i386

The ZFS pool now contains 3 RAIDs:

2007-11-20.11:49:17 zpool create pool /dev/label/proware263
2007-11-20.11:53:31 zfs create pool/project
... (zfs create other filesystems) ...
2007-11-20.11:54:32 zfs set atime=off pool
2007-12-08.22:59:15 zpool add pool /dev/da0
2008-01-05.21:20:03 zpool add pool /dev/label/proware262

After a power loss yesterday, zpool status shows:

# zpool status -v
  pool: pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible.
        Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed with 231 errors on Mon Jan 7 08:05:35 2008
config:

        NAME                STATE     READ WRITE CKSUM
        pool                ONLINE       0     0   516
          label/proware263  ONLINE       0     0   231
          da0               ONLINE       0     0   285
          label/proware262  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /system/database/mysql/flickr_geo/flickr_raw_tag.MYI
        pool/project:<0x0>
        pool/home/master/96:<0xbf36>

The main problem is that we cannot mount pool/project any more:

# zfs mount pool/project
cannot mount 'pool/project': Input/output error
# grep ZFS /var/log/messages
Jan 7 10:08:35 cml3 root: ZFS: zpool I/O failure, zpool=pool error=86
(repeated many times)

There is a lot of data in pool/project, probably 3.24T. zdb shows:

# zdb pool
...
Dataset pool/project [ZPL], ID 33, cr_txg 57, 3.24T, 22267231 objects
...

(zdb is still running now; we can provide the output if helpful.)

Is there any way to recover any data from pool/project?
Thank you very much.

Sincerely,
Tz-Huan
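
For reference, the per-device CKSUM counters in the zpool status output can be tallied with a few lines of Python. This is only a rough sketch: the table text is retyped from the output in this message (the exact column spacing is an assumption), parse_config is a hypothetical helper and not part of any ZFS tool, and it only handles plain integer counters.

```python
# Sketch: tally the CKSUM column from the config table of `zpool status -v`.
# STATUS_TEXT is retyped from the output quoted in this message; the spacing
# and the helper name parse_config are illustrative assumptions.

STATUS_TEXT = """\
        NAME                STATE     READ WRITE CKSUM
        pool                ONLINE       0     0   516
          label/proware263  ONLINE       0     0   231
          da0               ONLINE       0     0   285
          label/proware262  ONLINE       0     0     0
"""


def parse_config(text):
    """Return a list of (name, depth, read, write, cksum) rows."""
    rows = []
    base_indent = None
    for line in text.splitlines():
        fields = line.split()
        # Keep only 5-column device rows; skip the header and blank lines.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        # Skip rows whose counters are not plain integers.
        if not all(f.isdigit() for f in fields[2:]):
            continue
        indent = len(line) - len(line.lstrip())
        if base_indent is None:
            base_indent = indent  # the first row is the pool itself
        depth = (indent - base_indent) // 2  # assumes 2 spaces per nesting level
        read, write, cksum = (int(f) for f in fields[2:])
        rows.append((fields[0], depth, read, write, cksum))
    return rows


rows = parse_config(STATUS_TEXT)
pool_cksum = rows[0][4]
# Top-level devices are the rows nested one level below the pool.
device_cksums = {name: cksum for name, depth, _, _, cksum in rows if depth == 1}

for name, cksum in device_cksums.items():
    print(f"{name}: {cksum} checksum errors")
print(f"sum over devices = {sum(device_cksums.values())}, pool total = {pool_cksum}")
```

For the numbers above, the three top-level devices add up to 516, which matches the pool-level CKSUM counter; only label/proware263 and da0 report checksum errors, while label/proware262 reports none.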