From: David Christensen
To: freebsd-questions@freebsd.org
Date: Wed, 21 Sep 2022 03:58:47 -0700
Subject: data, metadata, backup, and archive integrity and correction
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

On 9/21/22 00:33, Ralf Mardorf wrote:
> On Tue, 2022-09-20 at 12:00 -0700, David Christensen wrote:
>> For off-line backup disks, I find mobile racks to be more reliable
>> than USB/ Firewire/ eSATA:
>
> Hi,
>
> I tested a lot of casings and started using casings that have both
> USB3 <= 5 Gbit/s and eSATA <= 3 Gbit/s plugs and that are powered by
> their own power supply. I don't know if everything is powered by the
> casings' power supply, parts might still be bus powered. The firmware
> of the casings has no enforced power saving feature, hence the
> drives are always spinning, the heads never park, the drives are
> always ready for action. USB was reliable when using those casings for
> years and it still is almost reliable. However, "was reliable" +
> "still is almost reliable" = unreliable.
>
> In my experience eSATA <= 3 Gbit/s is reliable, but way too slow.
>
> I never used a mobile rack, but this is something I am considering
> using in the future, too. Unfortunately I'm using the external drives
> by rotation not only to back up data from a tower/desktop PC that can
> hold a rack mount. I'm also using drives with iPadOS, which can only
> access an external drive via USB.
>
> It's not possible to completely abandon USB drives. Once data is saved
> by USB and verified, it's safe. If restoring data from a USB drive
> fails, it's still possible to remove the HDD from the casing and to
> connect it by SATA. The casings I'm using provide eSATA, hence I
> don't even need to open the casing.
>
> Conclusion: USB drives are a PITA. Most don't even fit the category
> "was reliable" + "still is almost reliable", they are often completely
> useless, only working for Windows users who every now and then move a
> few GiB and for users who never verify their archives. Many users
> notice that their archives are corrupted only when they try to restore
> data from an archive, because they never listed the contents after
> creating an archive with exit status 0.
> The exit status 0 from creating an archive with tar doesn't guarantee
> that an archive isn't corrupted, it only says that no error was
> noticed, not that no error happened.

Integrity checking of data and metadata is important -- both for live
data and, especially, for backups and archives. When corruption is
detected (e.g. damaged optical media, bad blocks/ cells, "bit rot"), a
correction mechanism is desirable.

Traditional filesystems (UFS, ext4), volumes (geom, LVM), RAID (geom,
md), etc., may detect failing or failed drives, but may not detect all
forms of corruption.

I use Debian Stable on desktops/ workstations and have read about the
Linux dm-integrity layer, but dm-integrity does not seem to be fully
integrated into Debian (yet).

AIUI ZFS and btrfs both implement integrity checking of data and
metadata, and can automatically correct corruption if redundancy is
provisioned. I tried btrfs on Debian and found it to be lacking. ZFS
is mature and fully integrated on FreeBSD. I migrated my servers to
FreeBSD and ZFS.

MD5/ SHA256 checksum files are multi-platform, but only cover the data
and only provide pass/fail detection for whole files.

mtree(8) adds integrity checking for most Unix metadata, but I am
unsure if mtree(8) covers ACLs. mtree(8) needs an incremental
specification update feature for practical use on large data stores.
mtree(8) is not well supported on platforms other than the BSDs. I
have written scripts when I wanted these kinds of checks.

Restoring backups is a worthwhile exercise. But then you need to
validate the restored copy against the original. If the original has
changed over time, you need something like saved mtree(8)
specifications. If the validation fails, how do you correct?

lzip(1) is a compressor with integrity checking, and its companion
lziprecover provides correction features. I need to evaluate them.

ZFS replication involves producing and consuming a replication stream.
The replication stream can be saved to a file and consumed later, by
the same computer and/or by one or more other computers. This
provides new possibilities for backup and restore. I need to explore
them.

David
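
For illustration only, here is a minimal sketch of the checksum-file
idea discussed above, extended with a little metadata so that a
restored copy can be validated against a saved manifest. It is not
the scripts mentioned in this message and it is not mtree(8)'s
format; the function names (sha256_file, build_manifest, compare)
are hypothetical, and it assumes that recording only SHA-256, size,
and permission bits is enough for the check you want. Ownership,
timestamps, ACLs, and symlinks are deliberately not covered.

```python
import hashlib
import os
import stat


def sha256_file(path, bufsize=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(bufsize), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to a tuple of
    (sha256, size in bytes, permission bits). Hypothetical helper: it
    skips symlinks and records no ownership, timestamps, or ACLs."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            info = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = (
                sha256_file(full),
                info.st_size,
                stat.S_IMODE(info.st_mode),
            )
    return manifest


def compare(saved, current):
    """Return (missing, extra, changed) as sorted lists of relative paths.
    missing: in saved but not in current; extra: the reverse;
    changed: present in both but with a different recorded tuple."""
    saved_paths, current_paths = set(saved), set(current)
    missing = sorted(saved_paths - current_paths)
    extra = sorted(current_paths - saved_paths)
    changed = sorted(
        path for path in saved_paths & current_paths
        if saved[path] != current[path]
    )
    return missing, extra, changed
```

One way to use it: call build_manifest() on the original tree at
backup time and keep the result with the backup, then call
build_manifest() on the restored tree and pass both to compare().
Like plain checksum files, this only detects differences per whole
file; it offers no correction.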