Date: Mon, 30 May 2011 12:19:10 -0500
From: Dan Nelson
To: Olaf Seibert
Cc: freebsd-stable@freebsd.org, Jeremy Chadwick
Subject: Re: ZFS I/O errors

In the last episode (May 30), Olaf Seibert said:
> On Mon 30 May 2011 at 03:33:49 -0700, Jeremy Chadwick wrote:
> > On Mon, May 30, 2011 at 12:10:51PM +0200, Olaf Seibert wrote:
> > I'm not sure why this didn't actually map to a filename on the system
> > however. I've never quite understood what the hexadecimal values shown
> > represent (I have ideas but it'd be useful to know what they meant).
>
> The scrub is starting to add some filenames to the list. So far they are
> two filenames in snapshots (where current versions of the file have been
> modified since then).
>
> > Try running without compression and see if that improves things.
>
> That sounds like a good idea.
>
> My theory so far is that it ran out of memory while compressing, with
> incorrect compressed data written to the disk.

The ZFS compression code will panic if it can't allocate the buffer needed
to store the compressed data, so that's unlikely to be your problem.

The only time I have seen an "illegal byte sequence" error was when trying
to copy raw disk images containing ZFS pools to different disks, and the
destination disk was a different size than the original. I wasn't even
able to import the pool in that case, though.

The zfs IO code overloads the EILSEQ error code and uses it as a "checksum
error" code. Returning that error for the same block on all disks is
definitely weird. Could you have run a partitioning tool, or some other
program that would have done direct writes to all of your component disks?

Your scrub is also a bit worrying - 24k checksum errors definitely
shouldn't occur during normal usage.

-- 
	Dan Nelson
	dnelson@allantgroup.com
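
To illustrate the EILSEQ overloading described above, here is a minimal Python sketch of how a script might annotate that errno when it comes back from a read on a ZFS filesystem. The name ECKSUM and the helper describe_io_error are hypothetical, chosen only for this example, and are not taken from the ZFS sources. The message text returned by os.strerror differs between platforms, so no output is shown.

```python
import errno
import os

# Hypothetical alias for illustration only: the message above says the ZFS
# I/O code reuses EILSEQ to mean "checksum error".
ECKSUM = errno.EILSEQ


def describe_io_error(err: int) -> str:
    """Return the system message for err, annotating EILSEQ for ZFS users."""
    text = os.strerror(err)
    if err == ECKSUM:
        # The "illegal byte sequence" wording is misleading in this context.
        return f"{text} (on ZFS this may mean a checksum error)"
    return text


if __name__ == "__main__":
    print(describe_io_error(errno.EILSEQ))
    print(describe_io_error(errno.EIO))
```

A wrapper like this only changes how the error is reported; it cannot tell a real checksum failure from a genuine EILSEQ raised by other code.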