Date:      Thu, 4 Oct 2012 22:28:03 -0400
From:      Mike Jeays <mike.jeays@rogers.com>
To:        freebsd-questions@freebsd.org
Cc:        Австин Ким <avstin@mail.ru>
Subject:   Re: cksum entire dir??
Message-ID:  <20121004222803.459c4fa6@europa>
In-Reply-To: <1349400979.263410106@f86.mail.ru>
References:  <1349400979.263410106@f86.mail.ru>

On Fri, 05 Oct 2012 05:36:19 +0400
Австин Ким <avstin@mail.ru> wrote:

> Hi, all,
>
> > Paul Kraus <paul at kraus-haus.org> writes:
> >
> > > On Tue, Sep 11, 2012 at 9:18 PM,  <kpneal at pobox.com> wrote:
> > >
> > >> It's a real shame Unix doesn't have a really good tool for comparing
> > >> two directory trees. You can use 'diff -r' (even on binaries), but that
> > >> fails if you have devices, named pipes, or named sockets in the
> > >> filesystem. And diff or cksum don't tell you if symlinks are different.
> > >> Plus you may care about file ownership, and that's where the stat
> > >> command comes in handy.
> > >
> > > Solaris and at least a few versions of Linux have a "dircmp" command
> > > that is in reality a wrapper for diff that handles special files. The
> > > problem with it is that it tends to be slow (I had to validate
> > > millions of files).
> >
> > It's not clear what the danger profile is supposed to be here; dircmp
> > (and recursing 'diff' applications) can handle many cases, but mtree(8)
> > (with appropriate options) covers more pathological problems. Even so,
> > analysis of changes in file nodes like named sockets will usually
> > require some understanding of the application.
> >
> > I suspect that either a recursive diff or an mtree specification is a
> > good solution for the original poster's problem, but we don't have
> > enough information to be more sure than that.
> >
> > Be well.
> >        Lowell
>
> I happened to be restoring my home directory on my local machine and
> needed a way to verify that its contents were in sync with the
> corresponding directories on a remote server.  I first looked for an
> _rsync_ option that would check synchronization without actually
> forcing a one-way sync of one side to the other, but couldn't find
> precisely what I was looking for.  By coincidence, I then came upon
> this thread, where the same issue had recently come up.
>
> Obviously there must be more rigorous, secure, and industrial-strength
> ways to check synchronization between corresponding directories on
> remote systems (apart from doing a one-way sync with _rsync_), but
> here's my two bits: a quick crack at a shell function to check
> recursively that the contents of two directories (and the filenames
> contained therein) have a high probability of being in sync:
>
> ####BEGIN CUT
>
> # s:  Function to compute recursive MD5 sum.
> s ( ) {
>   if [ -d "$1" ]
>      then DIR=$1
>      else DIR=.
>   fi
>   if [ `uname` = Linux ]
>      then find "$DIR" -type f -or -type l |sort |tr \\n \\0 |xargs -0 openssl \
>             dgst |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ |tee /tmp/dgst
>           openssl dgst </tmp/dgst
>      else find -s "$DIR" -type f -or -type l    |tr \\n \\0 |xargs -0 md5 \
>                  |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ |tee /tmp/dgst
>           md5 </tmp/dgst
>   fi
>   unset DIR
>   rm /tmp/dgst
>   return
>   }
>
> # sq:  Function to compute recursive MD5 sum quietly.
> sq ( ) {
>   if [ -d "$1" ]
>      then DIR=$1
>      else DIR=.
>   fi
>   if [ `uname` = Linux ]
>      then find "$DIR" -type f -or -type l |sort |tr \\n \\0 |xargs -0 openssl \
>             dgst |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ >/tmp/dgst
>           openssl dgst </tmp/dgst
>      else find -s "$DIR" -type f -or -type l    |tr \\n \\0 |xargs -0 md5 \
>                  |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ >/tmp/dgst
>           md5 </tmp/dgst
>   fi
>   unset DIR
>   rm /tmp/dgst
>   return
>   }
>
> ####END CUT
>
> These functions simply apply the `find ... |xargs' method suggested in
> previous posts to output a list of MD5 digests with filenames, and
> then run _md5_ on the resulting file.  I tried them in sh(1) on
> FreeBSD (my local machine) and in ksh(1) on Linux (the remote server),
> though I haven't tested them extensively.  Obviously the above are not
> `secure,' and many variations are possible (for example, also writing
> file permissions and modification dates to the digest file with ls(1)
> before running _md5_ on it, to check that permissions and dates are
> also in sync).  Thanks to the previous posters for solving my
> problem!  :)
>
> All the best,
> Austin
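
For what it's worth, the same find-and-digest idea can be sketched
without the uname branch or the temporary file. This is only a sketch
under stated assumptions: the name treesum is made up for illustration,
it uses cksum(1) (a CRC, not MD5) because that is in POSIX, it covers
regular files only, and it assumes there are no newlines in filenames.

```shell
# treesum: print a single checksum line for the regular files under a
# directory (default: the current directory).
# Hypothetical helper, not from the quoted message. Assumes POSIX
# find(1), sort(1) and cksum(1), and filenames without newlines.
treesum() {
    (
        # Work from inside the directory so the listed paths are
        # relative and two trees in different places can match.
        cd "${1:-.}" &&
            find . -type f -exec cksum {} + |
            LC_ALL=C sort -k 3 |
            cksum
    )
}
```

Running treesum on the same directory on both machines and comparing
the two output lines should show whether file names and contents agree;
it says nothing about permissions, ownership, or symlinks.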

"rsync --dry-run" may be a simple solution that meets your needs. You might need to add the "--delete" option so that files present only on the destination side are reported as well.

Take another look at man rsync.
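
As a rough sketch of what I mean (the function name drycheck is made up
for illustration, and the flag choice is my assumption about what you
want to compare): -n is the short form of --dry-run, -a compares in
archive mode, -c compares file contents by checksum instead of size and
time, and -i itemizes each difference.

```shell
# drycheck: list what rsync WOULD change in DEST to make it match SRC.
# Hypothetical wrapper; because of -n nothing is copied or deleted.
# Usage: drycheck SRC DEST    (either side may be host:path)
drycheck() {
    # Trailing slashes make rsync compare the contents of the two
    # directories rather than nest SRC inside DEST.
    rsync -n -a -c -i --delete "$1"/ "$2"/
}
```

Lines starting with "*deleting" are files that exist only on the DEST
side; other itemized lines are files that differ or are missing there.
The check is one-directional, so the order of the arguments matters.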


