From: Walter Parker
Date: Wed, 8 May 2019 08:55:37 -0700
Subject: Re: ZFS...
To: freebsd-stable@freebsd.org

> ZDB (unless I'm misreading it) is able to find all 34m+ files and
> verifies the checksums.
> The problem is in the zfs data structures (one definitely, two maybe,
> metaslabs fail checksums preventing the mounting (even read-only) of the
> volumes.)
>
> > Especially, how do you know
> > before you recovered the data from the drive.
>
> See above.
>
> > As ZFS metadata is stored redundantly on the drive and never in an
> > inconsistent form (that is what fsck does, it fixes the inconsistent
> > data that most other filesystems store when they crash/have disk
> > issues).
>
> The problem - unless I'm reading zdb incorrectly - is limited to the
> structure rather than the data. This fits with the fact the drive was
> isolated from user changes when the drive was being resilvered so the
> data itself was not being altered .. that said, I am no expert so I
> could easily be completely wrong.

What it sounds like you need is a metadata fixer, not a file recovery tool.
Assuming the metadata can be fixed, that would be the easy route. That should
not be hard to write if everything else on the disk has no issues. Didn't you
say in another message that the system is now returning hundreds of drive
errors? How does that relate to the statement "Everything on the disk is fine
except for a little bit of corruption in the freespace map"?

> > I have a friend/business partner that doesn't want to move to ZFS
> > because his recovery method is wait for a single drive (no-redundancy,
> > sometimes no backup) to fail and then use ddrescue to image the broken
> > drive to a new drive (ignoring any file corruption because you can't
> > really tell without ZFS). He's been using disk rescue programs for so
> > long that he will not move to ZFS, because it doesn't have a disk rescue
> > program.
>
> The first part is rather cavalier .. the second part I kinda
> understand... it's why I'm now looking at alternatives ... particularly
> being bitten as badly as I have with an unmountable volume.

On the system I managed for him, we had a system with ZFS crap out.
I restored it from a backup. I continue to believe that people running
systems without backups are living on borrowed time. The idea of relying on
a disk recovery tool is too risky for my taste.

> > He has systems on Linux with ext3 and no mirroring or backups. I've
> > asked about moving them to a mirrored ZFS system and he has told me that
> > the customer doesn't want to pay for a second drive (but will pay for
> > hours of his time to fix the problem when it happens). You kind of sound
> > like him.
>
> Yeah..no! I'd be having that on a second (mirrored) drive... like most
> of my production servers.
>
> > ZFS is risky
> > because there isn't a good drive rescue program.
>
> ZFS is good for some applications. ZFS is good to prevent cosmic ray
> issues. ZFS is not good when things go wrong. ZFS doesn't usually go
> wrong. Think that about sums it up.

When it does go wrong, I restore from backups. Therefore my systems don't
have problems. I'm sorry you had the perfect trifecta that caused you to lose
multiple drives and all your backups at the same time.

> > Sun's design was that the system should be redundant by default and
> > checksum everything. If the drives fail, replace them. If they fail too
> > much or too fast, restore from backup. Once the system had too much
> > corruption, you can't recover/check for all the damage without a second
> > off disk copy. If you have that off disk, then you have backup. They
> > didn't build for the standard use case as found in PCs because the disk
> > recover programs rarely get everything back, therefore they can't be
> > relied on to get your data back when your data is important. Many PC
> > owners have brought PC mindset ideas to the "UNIX" world. Sun's history
> > predates Windows and Mac and comes from a Mini/Mainframe mindset (where
> > people tried not to guess about data integrity).
>
> I came from the days of Sun.

Good, then you should understand Sun's point of view.
> > Would a disk rescue program for ZFS be a good idea? Sure. Should the
> > lack of a disk recovery program stop you from using ZFS? No. If you
> > think so, I suggest that you have your data integrity priorities in the
> > wrong order (focusing on small, rare events rather than the common base
> > case).
>
> Common case in your assessment in the email would suggest backups are
> not needed unless you have a rare event of a multi-drive failure. Which
> I know you're not advocating, but it is this same circular argument...
> ZFS is so good it's never wrong we don't need no stinking recovery
> tools, oh but take backups if it does fail, but it won't because it's so
> good and you have to be running consumer hardware or doing something
> wrong or be very unlucky with failures... etc.. round and round we go,
> wherever she'll stop no-one knows.

I advocate 2-3 backups of any important system (at least one different from
the others, offsite if one can afford it). I never said ZFS is so good we
don't need backups (that would be a stupid comment). As far as a recovery
tool goes, those sound risky. I'd prefer something without so much risk. Make
your own judgement; it is your time and data. I think ZFS is a great
filesystem that anyone using FreeBSD or illumos should be using.

-- 
The greatest dangers to liberty lurk in insidious encroachment by men of
zeal, well-meaning but without understanding. -- Justice Louis D. Brandeis