From owner-freebsd-current@FreeBSD.ORG Thu Dec 13 04:18:19 2007
Message-ID: <4760B342.4000301@clearchain.com>
Date: Thu, 13 Dec 2007 14:51:22 +1030
From: Benjamin Close
To: Kip Macy
References: <47606C09.2070209@isc.org> <47609F0A.7010805@clearchain.com> <47609FE3.8040606@barafranca.com>
Cc: freebsd-current@freebsd.org, Hugo Silva
Subject: Re: ZFS melting under postgres...
List-Id: Discussions about the use of FreeBSD-current

Kip Macy wrote:
> On Dec 12, 2007 6:58 PM, Hugo Silva wrote:
>
>> Benjamin Close wrote:
>>
>>> Peter Losher wrote:
>>>
>>>> Hi,
>>>>
>>>> As part of our testing of 7.0/ZFS we tried putting it through its
>>>> paces, having ZFS act as the storage medium for some test pgsql DBs
>>>> (for sqlgrey, etc.), and in both BETA2 and BETA4 (amd64) we get the
>>>> same results with a RAIDZ2 container:
>>>>
>>>> -=-
>>>> Dec 12 14:24:12 nsa sqlgrey: fatal: setconfig error at /usr/local/sbin/sqlgrey line 186.
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa postgres[50527]: [5-1] PANIC: could not write to log file 2, segment 53 at offset 7864320, length 8192: Input/output error
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>>>> Dec 12 16:49:53 nsa postgres[50596]: [1-1] FATAL: the database system is starting up
>>>> Dec 12 16:49:53 nsa kernel: pid 50527 (postgres), uid 70: exited on signal 6 (core dumped)
>>>> -=-
>>>>
>>>> It basically corrupts the container from the inside until it fails
>>>> completely (usually within 24-48 hours, depending on how busy the db is).
>>>>
>>>> I had thought it was a bad SATA replicator/controller, but we had that
>>>> replaced w/ one from Supermicro. So it's either the disks, or something
>>>> in ZFS. Has anyone used ZFS as the backend for any DBs (mysql or pgsql)?
>>>>
>>>> If you need more info, let me know...
>>>>
>>> Try turning off the ZIL. Whilst I don't use a db, I have zfs under high
>>> load, and I've found that without the ZIL turned off I see checksum
>>> corruption as well:
>>>
>>> /boot/loader.conf
>>>
>>> vfs.zfs.zil_disable=1
>>>
>>> Cheers,
>>> Benjamin
>>>
>> Wouldn't it be a bad idea to disable the ZIL?
>>
>> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29
>>
>
> Yes. However, FreeBSD suffers from deadlocks under load if ZIL is enabled.
>
> -Kip
>
It also comes down to what you're doing. ZFS is always consistent on
disk. The ZIL provides the journal between the last pool transaction
write and what has changed since that write. Either way, ZFS will come
up cleanly after a power failure; it's just a question of whether you
still have those last few syncs or not. For the application I'm using
ZFS for (rsynced backups, snapshotted daily), that'll be corrected the
next day anyway. For a DB, losing those last few syncs could be a show
stopper.

Cheers,
Benjamin
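
For anyone wanting to tally which devices and offsets the quoted checksum errors hit, here is a rough sketch. It assumes each syslog entry sits on one line in the format shown in Peter's log; the regex, the `summarize` name and the sample list are illustrative only and not part of any ZFS tool.

```python
import re
from collections import Counter

# Hypothetical pattern for the "checksum mismatch" syslog lines quoted in
# this thread; \s+ tolerates the line wrap between zpool= and path=.
PATTERN = re.compile(
    r"ZFS: checksum mismatch, zpool=(?P<pool>\S+)\s+"
    r"path=(?P<path>\S+) offset=(?P<offset>\d+) size=(?P<size>\d+)"
)


def summarize(lines):
    """Count checksum mismatches per (pool, device, offset, size)."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            key = (m["pool"], m["path"], int(m["offset"]), int(m["size"]))
            counts[key] += 1
    return counts


# A few entries copied from the quoted log, one per line.
sample = [
    "Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault "
    "path=/dev/ad4 offset=3665128448 size=22016",
    "Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault "
    "path=/dev/ad16 offset=3665128448 size=21504",
    "Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault "
    "path=/dev/ad4 offset=3665128448 size=22016",
    "Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86",
]

for key, n in sorted(summarize(sample).items()):
    print(key, n)
```

Run over the full log, a tally like this makes it easy to see whether the errors cluster on one disk or, as in the excerpt quoted here, show up at the same offset on every member of the pool.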