From owner-freebsd-stable@freebsd.org Wed May 8 03:10:06 2019
From: Walter Parker <walterp@gmail.com>
Date: Tue, 7 May 2019 20:09:49 -0700
Subject: Re: ZFS...
To: freebsd-stable@freebsd.org

> Everytime I have seen this issue (and it's been more than once - though
> until now recoverable - even if extremely painful) - its always been during
> a resilver of a failed drive and something happening... panic,
> another drive failure, power etc.. any other time its rock solid...
> which is the yes and no... under normal circumstances zfs is very very
> good and seems as safe as or safer than UFS... but my experience is ZFS
> has one really bad flaw.. if there is a corruption in the metadata -
> even if the stored data is 100% correct - it will fault the pool and
> thats it it's gone barring some luck and painful recovery (backups
> aside) ... this other file systems also suffer but there are tools that
> *majority of the time* will get you out of the s**t with little pain.
> Barring this windows based tool I haven't been able to run yet, zfs
> appears to have nothing.

This is the difference I see here. You keep saying that all of the file
data on the drive is 100% correct, and that only the metadata on the
drive is corrupted. How do you know this? In particular, how do you know
it before you have recovered the data from the drive? ZFS metadata is
stored redundantly on the drive and is never written out in an
inconsistent form (repairing the inconsistent state that most other
filesystems leave on disk after a crash or disk problem is exactly what
fsck is for). If the metadata is corrupted, how would ZFS know what else
is correct (computers don't understand things, they just follow the
numbers)? And if even the redundant copies of the metadata are corrupt,
what are the odds that the file data is intact? In my experience,
getting the metadata trashed while none of the file data is trashed is a
rare event on a system with multi-drive redundancy.

I have a friend/business partner who doesn't want to move to ZFS because
his recovery method is to wait for a single drive (no redundancy,
sometimes no backup) to fail and then use ddrescue to image the broken
drive onto a new drive (ignoring any file corruption, because without
ZFS you can't really tell).
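For reference, the workflow he relies on is roughly the standard
two-pass GNU ddrescue run. The device and map-file names below are made
up for illustration; the echo prefix keeps this sketch from touching any
real disk (drop it to actually run the commands):

```shell
#!/bin/sh
# Sketch of a typical two-pass GNU ddrescue recovery.
# Device names are hypothetical examples, not real hardware.
SRC=/dev/ada1         # the failing drive (example name)
DST=/dev/ada2         # the replacement drive (example name)
MAP=/root/rescue.map  # map file, lets an interrupted run resume

# Pass 1 (-n): copy everything readable quickly, skip bad areas.
echo ddrescue -f -n "$SRC" "$DST" "$MAP"
# Pass 2 (-d -r3): direct disc access, retry bad areas up to 3 times.
echo ddrescue -f -d -r3 "$SRC" "$DST" "$MAP"
```

Note what this does and doesn't get you: it recovers the readable
blocks, but any file whose blocks fell in the unreadable areas is
silently corrupt afterwards - and without checksums you can't tell
which ones.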
He's been using disk rescue programs for so long that he will not move
to ZFS, because it doesn't have one. He has systems on Linux with ext3
and no mirroring or backups. I've asked about moving them to a mirrored
ZFS setup, and he tells me the customer doesn't want to pay for a second
drive (but will pay for hours of his time to fix the problem when it
happens).

You sound a bit like him: ZFS is risky because there isn't a good drive
rescue program for it. Sun's design philosophy was that the system
should be redundant by default and checksum everything. If drives fail,
replace them. If they fail too often or too fast, restore from backup.
Once a system has taken too much corruption, you cannot find or repair
all of the damage without a second, off-disk copy of the data - and if
you have that off-disk copy, you have a backup. Sun didn't build for the
standard PC use case because disk recovery programs rarely get
everything back, and therefore can't be relied on when your data is
important. Many PC owners have brought PC-mindset ideas to the "UNIX"
world; Sun's history predates Windows and the Mac and comes from a
mini/mainframe mindset (where people tried not to guess about data
integrity).

Would a disk rescue program for ZFS be a good idea? Sure. Should the
lack of one stop you from using ZFS? No. If you think so, I suggest
your data integrity priorities are in the wrong order (focused on
small, rare events rather than the common case).

Walter


-- 
The greatest dangers to liberty lurk in insidious encroachment by men
of zeal, well-meaning but without understanding.
   -- Justice Louis D. Brandeis