From owner-freebsd-fs@freebsd.org Sat Aug 5 17:09:14 2017
From: "Eugene M. Zheganin" <emz@norma.perm.ru>
To: freebsd-stable@FreeBSD.org
Cc: freebsd-fs@freebsd.org
Subject: a strange and terrible saga of the cursed iSCSI ZFS SAN
Message-ID: <1bd10b1e-0583-6f44-297e-3147f6daddc5@norma.perm.ru>
Date: Sat, 5 Aug 2017 22:08:54 +0500

Hi,

I've got a problem that I cannot solve by myself: an iSCSI ZFS SAN system that keeps crashing and corrupting its data. I'll be brief and describe its history step by step:

1) Autumn 2016: the SAN was set up - a Supermicro server, an external JBOD, SanDisk SSDs, several redundant pools, FreeBSD 11.x (probably a RELEASE, I don't really remember - see below).

2) It worked just fine until early spring 2017.

3) Then the system started to crash with various panics:

panic: general protection fault
panic: page fault
panic: Solaris(panic): zfs: allocating allocated segment(offset=6599069589504 size=81920)
panic: page fault
panic: page fault
panic: Solaris(panic): zfs: allocating allocated segment(offset=8245779054592 size=8192)
panic: page fault
panic: page fault
panic: page fault
panic: Solaris(panic): zfs: allocating allocated segment(offset=1792100934656 size=46080)

4) We memtested it immediately; no problems were found.

5) We switched the SanDisks to Toshibas, and also switched the server and the JBOD to identical ones, leaving the same cables in place.

6) The crashes didn't stop.

7) We found that field engineers had physically damaged (sic!) the SATA cables (the main one and the spares), and that 90% of the disks showed ICRC SMART errors (the check is sketched in the P.S. below).

8) We replaced the cable with a brand new HP one.

9) The ATA SMART error counters stopped increasing.

10) The crashes continued.
11) We decided that the pool itself had probably been damaged too while the data was moved over the broken cables between the JBODs, and that it was now panicking because of that. So we wiped the data completely, reinitialized the SAN system, and put it back into production; we even dd'ed each disk with zeroes (!) - just in case (roughly the procedure sketched in the P.S. below). Important note: the data was imported with zfs send from another, stable system that is running in production in another DC.

12) Today we got another panic. By the way, the pools now look like this:

# zpool status -v
  pool: data
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0    62
          raidz1-0  ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da10    ONLINE       0     0     0
            da11    ONLINE       0     0     0
          raidz1-2  ONLINE       0     0    62
            da12    ONLINE       0     0     0
            da13    ONLINE       0     0     0
            da14    ONLINE       0     0     0
            da15    ONLINE       0     0     0
            da16    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        data/userdata/worker208:<0x1>

  pool: userdata
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME               STATE     READ WRITE CKSUM
        userdata           ONLINE       0     0  216K
          mirror-0         ONLINE       0     0  432K
            gpt/userdata0  ONLINE       0     0  432K
            gpt/userdata1  ONLINE       0     0  432K

errors: Permanent errors have been detected in the following files:

        userdata/worker36:<0x1>
        userdata/worker30:<0x1>
        userdata/worker31:<0x1>
        userdata/worker35:<0x1>

13) Somewhere between points 5 and 10 the pool was switched to deduplication (not directly connected to the problem, just for production reasons).

So, to conclude: we had bad hardware, we replaced EVERY piece of it (server, adapter, JBOD, cables, disks), and the crashes just don't stop. We have five other iSCSI SAN systems, almost fully identical, that don't crash. The crashes on this particular system began while it was running the same set of versions as the stable systems.

So, besides calling an exorcist, I would really like to hear what other options I have. I also want to ask: what happens when the system's memory isn't enough for deduplication - does it crash, or does the problem only appear when mounting the pool, as some articles mention? (A sketch of how the DDT footprint can be estimated is in the P.S. below.)

I could have encumbered this message with extra data, like the exact FreeBSD releases we ran (assuming that it's normal for some 11.x revisions to crash and damage data while others don't, which I believe is nonsense), or the pool configurations and disk lists (assuming, likewise, that you can provoke data loss with certain redundant pool configurations - not counting raidz with more than 5 disks - which I also believe is not true), and so on, but I decided not to include any of this until requested. And, as I also said, we have five other SAN systems running similar/identical configurations without major problems.

Thanks.
Eugene.
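
P.S. Since I referred to a couple of checks above, here are rough sketches; the device, pool and host names in them are illustrative, not the exact ones from this box.

The ICRC check from point 7 is basically looking at SMART attribute 199 (the interface CRC error count, which grows when data gets mangled on the cable between the drive and the controller); with smartmontools it's something along these lines (depending on the HBA, -d sat may be needed for SATA disks behind it):

# for d in da2 da3 da4; do echo "== $d"; smartctl -A /dev/$d | egrep -i '199|crc'; done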
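
The wipe and reimport from point 11 was roughly the following: zero every disk (the dd line below was repeated for each one), recreate the pool, then stream the datasets back in from the healthy box with zfs send/receive. The pool layout is taken from the zpool status above; the snapshot and host names are made up:

# dd if=/dev/zero of=/dev/da2 bs=1m
# zpool create data raidz1 da2 da3 da4 da5 da6 raidz1 da7 da8 da9 da10 da11 raidz1 da12 da13 da14 da15 da16
# ssh stable-san zfs send -R data/userdata@migration | zfs receive -F data/userdata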
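
And regarding the dedup question: as far as I understand, the in-core DDT footprint can be estimated from the pool itself - the figure usually quoted is roughly 320 bytes of RAM per DDT entry (that number comes from the common ZFS dedup write-ups, I haven't measured it here) - so the total number of allocated entries times ~320 bytes has to fit into what the ARC can actually hold. Using the data pool as the example:

# zpool status -D data                      (DDT summary and histogram)
# zdb -DD data                              (more detailed DDT statistics)
# sysctl kstat.zfs.misc.arcstats.size       (current ARC size on FreeBSD)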