Date: Wed, 18 Jul 2012 08:43:57 +0200
From: Kai Gallasch <gallasch@free.de>
To: CH <freebsd-fs@ch.pkts.ca>
Cc: freebsd-fs@freebsd.org
Subject: Re: Can you list internal checksums of a ZFS filesystem?
Message-ID: <6D778EEA-5B8F-4F59-B198-E5B098F3AE2C@free.de>
In-Reply-To: <20120717152629.42e0641e@fedora14-x86-64.shechinah.mi.microbiology.ubc.ca>
References: <20120717152629.42e0641e@fedora14-x86-64.shechinah.mi.microbiology.ubc.ca>

On 18.07.2012 at 00:26, CH wrote:
>
> Hello list,
>
> I'm moving data to a ZFS filesystem, and it's a ton of big files (more
> than 3 terabytes). I don't trust the network copy command completely,
> and so I'd like to compare checksums. I'm not looking forward to it,
> since it's going to be a slow process, especially if I can't run the
> command on the server.

You could use rsync for transferring the data.

According to its man page, rsync calculates checksums for transferred
files and, on its initial run, compares checksums on the sending and
receiving side for each file:

http://www.freebsd.org/cgi/man.cgi?query=rsync&apropos=0&sektion=0&manpath=FreeBSD+Ports&arch=default&format=html

     -c, --checksum
            This changes the way rsync checks if the files have been
            changed and are in need of a transfer. Without this option,
            rsync uses a "quick check" that (by default) checks if each
            file's size and time of last modification match between the
            sender and receiver. This option changes this to compare a
            128-bit checksum for each file that has a matching size.
            Generating the checksums means that both sides will expend
            a lot of disk I/O reading all the data in the files in the
            transfer (and this is prior to any reading that will be
            done to transfer changed files), so this can slow things
            down significantly.

            The sending side generates its checksums while it is doing
            the file-system scan that builds the list of the available
            files. The receiver generates its checksums when it is
            scanning for changed files, and will checksum any file that
            has the same size as the corresponding sender's file: files
            with either a changed size or a changed checksum are
            selected for transfer.

            Note that rsync always verifies that each transferred file
            was correctly reconstructed on the receiving side by
            checking a whole-file checksum that is generated as the
            file is transferred, but that automatic after-the-transfer
            verification has nothing to do with this option's
            before-the-transfer "Does this file need to be updated?"
            check.

            For protocol 30 and beyond (first supported in 3.0.0), the
            checksum used is MD5. For older protocols, the checksum
            used is MD4.

So a first run of rsync without the -c switch, followed by a second run
with -c, should be quite sufficient to make sure the data has not
changed after being transferred. (Except, of course, if the underlying
filesystem layers lie about this to the application, or if MD5 is
wrongly implemented in rsync :-)

Also, rsync makes it possible to transfer the data in several runs, at
times most convenient to you (or your network).

It also supports a switch for limiting bandwidth usage (--bwlimit).

Have a nice day,

Kai.
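
To make the difference between the "quick check" and -c from the man
page excerpt concrete, here is a minimal Python sketch. It is only an
illustration and not rsync's actual implementation: the function names
md5_of and needs_transfer are hypothetical, and comparing modification
times at whole-second resolution is an assumption made for this sketch.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(src, dst, checksum=False):
    """Decide whether src would be selected for transfer to dst.

    checksum=False imitates the "quick check": size and modification time.
    checksum=True imitates -c: size first, then a whole-file MD5.
    (Hypothetical helper for illustration, not part of rsync.)
    """
    if not os.path.exists(dst):
        return True
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    if src_stat.st_size != dst_stat.st_size:
        return True
    if checksum:
        # Both files are read in full, which is why -c costs a lot of disk I/O.
        return md5_of(src) != md5_of(dst)
    # Assumption: compare modification times at whole-second resolution.
    return int(src_stat.st_mtime) != int(dst_stat.st_mtime)
```

A file whose content differs but whose size and modification time match
passes the quick check and is only caught with checksum=True, which is
the reason for a second pass with -c.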