Date: Fri, 23 Sep 2022 18:58:24 -0700
From: David Christensen <dpchrist@holgerdanske.com>
To: questions@freebsd.org
Subject: Re: data, metadata, backup, and archive integrity and correction
Message-ID: <166467a7-5aac-6c67-c462-432912c58211@holgerdanske.com>
In-Reply-To: <ed8431a943b8d3cd74e588c7ee9901c6c166eba8.camel@riseup.net>
References: <b027f6af-bf83-2663-b0ef-2480e385b189@holgerdanske.com> <c15361a3f8328583bdab528c5a49bf475a1dfdfa.camel@riseup.net> <7be8f47b-bfea-ef5a-7b59-2f94f8d310e2@holgerdanske.com> <ed8431a943b8d3cd74e588c7ee9901c6c166eba8.camel@riseup.net>
On 9/23/22 16:37, Ralf Mardorf wrote:
> On Fri, 2022-09-23 at 15:42 -0700, David Christensen wrote:
>> All versions of the photograph file opened correctly with a viewer.
>> All photographs looked the same on the screen. But, at least one file
>> is corrupt. Which file(s)? I never figured it out. I kept all
>> versions of the file. (And, I have kept all camera media.)
>
> Hi David,
>
> I'm a digital photographer newbie. I started digital photography in
> 2020. For developing photos and more editing, graphic art, I'm using a
> Linux desktop machine, but most of the time an iPad Pro. I copy all my
> cam's SD data to my Linux desktop PC, my iPad Pro and to at least two
> USB HDDs (ext4 and hfs+ without journaling) in the first place. Before I
> format an SD again, I take a look at all copied photos using a viewer.
> The photo backup/archiving is completely "decoupled" from all other
> backup/archiving. Non-destructive editing of photos, done on different
> machines, in my experience results in chaos. Way before verifying a
> probably corrupted backup, I already lose control. For example, it's
> already impossible to gain control over naming files of edited photos.
> Sharing edited photos among apps running on iPad OS already is a PITA,
> let alone sharing photos among operating systems.
>
> I've got tons of unneeded duplicates of some photos. Deleting a
> duplicated photo might render separately stored meta-data useless.
Two of the features that attracted me to ZFS were de-duplication and
compression. They work great for filesystem copy backups (e.g. rsync).
Here are the backups of my daily driver root filesystem:
2022-09-23 18:23:47 toor@f3 ~
# du -m -s /var/local/backup/laalaa.tracy.holgerdanske.com/
4781 /var/local/backup/laalaa.tracy.holgerdanske.com/
2022-09-23 18:26:30 toor@f3 ~
# ls /var/local/backup/laalaa.tracy.holgerdanske.com/.zfs/snapshot/ | wc -l
203
2022-09-23 18:16:03 toor@f3 ~
# du -m -c -s /var/local/backup/laalaa.tracy.holgerdanske.com/.zfs/snapshot/*
<snip>
984122 total
2022-09-23 18:31:21 toor@f3 ~
# zfs get all p3/backup/laalaa.tracy.holgerdanske.com | sort | egrep 'compress|used|dedup'
p3/backup/laalaa.tracy.holgerdanske.com  compression           lz4     inherited from p3
p3/backup/laalaa.tracy.holgerdanske.com  compressratio         2.16x   -
p3/backup/laalaa.tracy.holgerdanske.com  dedup                 verify  inherited from p3/backup
p3/backup/laalaa.tracy.holgerdanske.com  logicalused           52.9G   -
p3/backup/laalaa.tracy.holgerdanske.com  refcompressratio      1.84x   -
p3/backup/laalaa.tracy.holgerdanske.com  used                  25.3G   -
p3/backup/laalaa.tracy.holgerdanske.com  usedbychildren        0       -
p3/backup/laalaa.tracy.holgerdanske.com  usedbydataset         4.62G   -
p3/backup/laalaa.tracy.holgerdanske.com  usedbyrefreservation  0       -
p3/backup/laalaa.tracy.holgerdanske.com  usedbysnapshots       20.6G   -
So, 4.62G source filesystem size, 203 backups, 961G apparent size of the
backups (984122 MiB), 52.9G de-duplicated size of the backups, and 25.3G
compressed and de-duplicated size of the backups. So, ZFS de-duplication
and compression of the backups provided a savings of about 38:1
(961G / 25.3G). Without ZFS, I would have far fewer backups.
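As a rough check of the arithmetic above, here is a minimal Python sketch that recomputes the ratios from the figures in the du and zfs output. The variable names and the ratio() helper are made up for illustration, and it assumes that both du -m and the zfs "G" suffix use binary units (MiB and GiB).

```python
# Figures copied from the du and zfs output above.
# Assumption: du -m reports MiB and zfs "G" means GiB.
MIB_PER_GIB = 1024

apparent_mib = 984122  # "total" line of du -m -c -s over all snapshots
logical_gib = 52.9     # zfs logicalused
used_gib = 25.3        # zfs used

apparent_gib = apparent_mib / MIB_PER_GIB


def ratio(before, after):
    """Return how many times larger `before` is than `after`."""
    return before / after


print(f"apparent size of the backups: {apparent_gib:.0f}G")
print(f"apparent : logicalused = {ratio(apparent_gib, logical_gib):.1f}:1")
print(f"logicalused : used     = {ratio(logical_gib, used_gib):.2f}:1")
print(f"apparent : used        = {ratio(apparent_gib, used_gib):.0f}:1")
```

The last line is the overall figure quoted above; the two before it split it into the part that comes from not storing repeated data and the part that comes from compression.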
But de-duplication and compression of other data is debatable.
Photograph files are already compressed; so ZFS compression will be
useless. 10 copies of the exact same photograph file should
de-duplicate nicely. But opening a photograph file in an editor, making
some changes, saving as another file, and repeating 8 more times is
likely to result in 10 files all with different blocks; so ZFS
de-duplication will be useless.
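The two cases above can be illustrated with a small Python sketch. It is only a model under stated assumptions, not how ZFS is implemented: random bytes stand in for an already-compressed photograph, each re-saved edit is modelled as an unrelated byte stream, SHA-256 over fixed 128K blocks stands in for the block checksum (128K is the default ZFS recordsize), and zlib stands in for lz4. The function unique_blocks() and all variable names are hypothetical.

```python
import hashlib
import os
import zlib

RECORD_SIZE = 128 * 1024  # assumed block size (default ZFS recordsize)


def unique_blocks(files, record_size=RECORD_SIZE):
    """Return (total_blocks, unique_blocks) over all files.

    Blocks are compared by the SHA-256 digest of their contents.
    """
    seen = set()
    total = 0
    for data in files:
        for offset in range(0, len(data), record_size):
            block = data[offset:offset + record_size]
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# Hypothetical stand-in for a compressed photograph: 1 MiB of random bytes.
photo = os.urandom(1024 * 1024)

# Case 1: 10 byte-identical copies of the same file.
copies = [photo] * 10

# Case 2: the original plus 9 re-saved edits, modelled here as
# unrelated byte streams (an assumption about re-encoded image files).
edits = [photo] + [os.urandom(len(photo)) for _ in range(9)]

copies_total, copies_unique = unique_blocks(copies)
edits_total, edits_unique = unique_blocks(edits)
print(f"identical copies: {copies_unique} unique of {copies_total} blocks")
print(f"edited versions:  {edits_unique} unique of {edits_total} blocks")

# Compressing data that is already high-entropy gains nothing.
compressed = zlib.compress(photo)
print(f"compression ratio: {len(photo) / len(compressed):.2f}x")
```

Real edited photographs may share a few blocks (for example, unchanged metadata at the start of the file), so the second case is a worst-case model.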
David
