Date: Tue, 31 Mar 2009 10:11:48 +0100 (BST)
From: "Mark Powell"
To: Thomas Sparrevohn
Cc: kevin, freebsd-current@freebsd.org
Subject: Re: ZFS data error without reasons
Message-ID: <20090331100328.H46640@rust.salford.ac.uk>
In-Reply-To: <200903301745.46149.Thomas.Sparrevohn@btinternet.com>

On Mon, 30 Mar 2009, Thomas Sparrevohn wrote:

> On Wednesday 25 March 2009 09:41:39 kevin wrote:
>> Mark Powell wrote:
>>> Kevin,
>>> Did you fix your ZFS CRC errors?
>>> I responded to your thread, but no-one got back to me.
>>> I'm gonna start another thread later.
>>> This time I re-made the zpool in 8 compatible with 7. Once the errors
>>> started showing up in 8 I moved back to 7, on the same hardware, to
>>> perform the scrub to prove the problem is with 8. The 1st scrub in 7
>>> found some errors, but of course it would if 8 had messed up the data.
>>> Removed the few unimportant bad files (all were in snapshots).
>>> Just performing the 2nd scrub in 7 now. If this comes back with no
>>> errors, then we have stronger proof that there is something wrong, which
>>> seems quite intermittent, in 8 that randomly writes bad data.
>>> Cheers.
>>>
>> Yes, I can fix some ZFS CRC errors, and sometimes I can recover all error
>> files. Before I run "zpool import backup" to mount the zpool on a USB
>> hard disk, "zpool status" returns no errors. When I copy files to the USB
>> hard disk, I soon get lots of file errors. After a reboot, if I run
>> scrub, I can fix many errors. I just think copying files between two zpools,
>> one on a local hard disk and the other on a USB hard disk, may
>> easily reproduce the bug.

I didn't write that!

> I have not been following the entire thread - but I can reproduce ZFS CRC
> corruption on the current kernel just by unplugging a USB disk drive -
> There are no errors on the disks - revert to an old kernel FreeBSD
> w2fzz0vc03.aah-go-on.com 8.0-CURRENT FreeBSD 8.0-CURRENT #1 r189454M:
> Fri Mar 6 18:46:25 GMT 2009
> root@w2fzz0vc03.aah-go-on.com:/usr/obj/usr/src/sys/GENERIC amd64
>
> the problem can be solved - The weird thing is that it will give CRC
> errors (and permanent errors) in blocks that have not been touched (or at
> least I think so)

Can you be a little clearer? Perhaps some zpool status output with the
steps you've taken?

> I suspect that it may have to do with the USB DMA bounce buffer as an
> example see the message file included

I expect this is a red herring, but do you not have some kind of
kernel/module sync problem?

Mar 26 13:48:18 w2fzz0vc03 root: /etc/rc: WARNING: Unable to load kernel module daemon_saver
Mar 26 13:48:18 w2fzz0vc03 kernel: KLD daemon_saver.ko: depends on kernel - not available

Cheers.

-- 
Mark Powell - UNIX System Administrator - The University of Salford
Information & Learning Services, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 6843  Fax: +44 161 295 5888  www.pgp.com for PGP key