From: Kenneth Vestergaard Schmidt
To: Pawel Jakub Dawidek
Cc: freebsd-fs@freebsd.org
Date: Fri, 31 Aug 2007 23:04:37 +0200
Subject: Re: ZFS: 'checksum mismatch' all over the place
References: <20070820112946.GC16977@garage.freebsd.pl>
In-Reply-To: (Kenneth Vestergaard Schmidt's message of "Mon\, 20 Aug 2007 14\:20\:33 +0200")

Kenneth Vestergaard Schmidt writes:

>> How do you know it was fine? Did you have something that did
>> checksumming? You could try geli with integrity verification feature
>> turned on, fill the disks with some random data and then read it back,
>> if your controller corrupts the data, geli should tell you this.
>
> I may have to do this. The previous drive was almost filled to the brim
> with data, which rsync looked at each day, and we didn't have a lot of
> re-transfer, but that doesn't necessarily mean anything.

*blush* This turned out to be a firmware issue with the Eonstor
RAID enclosure. After upgrading to v3.47, everything is fine in the
checksum department.

Now, however, I can't seem to keep the box running. We've rsync'd
1.56 TB of data to an 8.18 TB raidz2 pool, and we're getting panics all
the time. It's an x86 with 4 GB RAM. I've got the following in
/boot/loader.conf:

  vfs.zfs.prefetch_disable="1"
  vfs.zfs.arc_max="107772160"
  vm.kmem_size_max="629145600"
  vm.kmem_size_min="629145600"

and kern.maxvnodes is set to 50000. When the machine has finished
booting, 'vmstat -m' says:

     Type   InUse   MemUse HighUse   Requests  Size(s)
  solaris   49972  158199K       -     455307  16,32,64,128,256,512,1024,2048,4096

and after about an hour's worth of rsync'ing, we get:

     Type   InUse   MemUse HighUse   Requests  Size(s)
  solaris  198797  449675K       -  404226785  16,32,64,128,256,512,1024,2048,4096

  panic: kmem_malloc(28672): kmem_map too small: 614682624 total allocated

I'm not quite sure which knobs to twiddle or which values to watch, so
any help in this department would be much appreciated. I'm sure it'd be
nice to update the Wiki with that info, too, since the values there
don't make things stable.

-- 
Kenneth Schmidt
pil.dk
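
For reference, the figures quoted in this message can be put side by side with a few lines of arithmetic. This is only a rough sketch: every number is copied from the loader.conf settings, the 'vmstat -m' output and the panic line above. The constant and function names are made up for illustration, and it assumes the K suffix in 'vmstat -m' means 1024 bytes. It does not diagnose anything. It only shows how large the 'solaris' allocation had grown relative to the configured kmem size and arc_max.

```python
# Figures copied from the message; nothing here is newly measured.
KMEM_SIZE = 629145600      # vm.kmem_size_max and vm.kmem_size_min, in bytes
ARC_MAX = 107772160        # vfs.zfs.arc_max, in bytes
PANIC_TOTAL = 614682624    # "total allocated" from the panic line, in bytes
SOLARIS_BOOT_KB = 158199   # 'solaris' MemUse right after boot
SOLARIS_LATE_KB = 449675   # 'solaris' MemUse after about an hour of rsync

MIB = 1024 * 1024


def kb_to_bytes(kb):
    """Convert a vmstat-style K figure to bytes (assuming K = 1024 bytes)."""
    return kb * 1024


def to_mib(nbytes):
    """Express a byte count in MiB."""
    return nbytes / MIB


solaris_boot = kb_to_bytes(SOLARIS_BOOT_KB)
solaris_late = kb_to_bytes(SOLARIS_LATE_KB)

# Share of the configured kmem size held by the 'solaris' type late in the run.
late_share = solaris_late / KMEM_SIZE

# How far the kmem total was from the configured size when the panic hit.
headroom_at_panic = KMEM_SIZE - PANIC_TOTAL

print(f"kmem size:            {to_mib(KMEM_SIZE):8.1f} MiB")
print(f"arc_max:              {to_mib(ARC_MAX):8.1f} MiB")
print(f"solaris after boot:   {to_mib(solaris_boot):8.1f} MiB")
print(f"solaris after rsync:  {to_mib(solaris_late):8.1f} MiB "
      f"({late_share:.0%} of kmem, {solaris_late / ARC_MAX:.1f}x arc_max)")
print(f"headroom at panic:    {to_mib(headroom_at_panic):8.1f} MiB")
```

Run as-is, it prints the same numbers in MiB, which makes it easier to see that the 'solaris' figure is several times arc_max while the kmem total at the panic is close to the configured size.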