Date:      Thu, 6 Apr 1995 07:34:03 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        nate@trout.sri.MT.net, phk@ref.tfs.com
Cc:        freebsd-hackers@freefall.cdrom.com, kargl@troutmask.apl.washington.edu, rgrimes@gndrsh.aac.dev.com, terry@cs.weber.edu
Subject:   Re: new install(1) utility
Message-ID:  <199504052134.HAA31128@godzilla.zeta.org.au>

>> > obvious if they don't match, and doing cksums on both files would be
>> > much faster than the 'cmp' IMHO.
>> 
>> Funny you should mention, I just ran some experiments (for CTM), and the
>> fastest thing you can do is to mmap both files and memcmp them...

There are many reasons why checksumming might be slower, especially if it
isn't implemented carefully.  Checksumming can only be faster if you can
usually avoid reading the target.  A non-hashed database in a single file
would be very slow.  You would have to use a hashed database.  Writing
the database would add a lot of overhead.  This would be more of a
problem for install than for ctm since files are unfortunately often
installed one at a time so the database would have to be opened and
closed a lot.
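
For reference, the mmap-and-memcmp comparison mentioned in the quoted text
can be sketched roughly as follows.  This is only an illustration under
assumptions, not the code used in the CTM experiments: the function name
files_identical() and its return convention are made up here, and error
handling is kept minimal.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and return values are assumptions):
 * returns 1 if the two files have identical contents, 0 if they
 * differ, and -1 if either file cannot be opened, stat'ed or mapped.
 */
int files_identical(const char *path_a, const char *path_b)
{
    struct stat sa, sb;
    void *pa = MAP_FAILED, *pb = MAP_FAILED;
    size_t len = 0;
    int result = -1;
    int fa = open(path_a, O_RDONLY);
    int fb = open(path_b, O_RDONLY);

    if (fa < 0 || fb < 0 || fstat(fa, &sa) < 0 || fstat(fb, &sb) < 0)
        goto out;

    /* Different sizes: no need to read either file. */
    if (sa.st_size != sb.st_size) {
        result = 0;
        goto out;
    }
    /* Two empty files are equal; a zero-length mmap would fail. */
    if (sa.st_size == 0) {
        result = 1;
        goto out;
    }

    len = (size_t)sa.st_size;
    pa = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fa, 0);
    pb = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fb, 0);
    if (pa != MAP_FAILED && pb != MAP_FAILED)
        result = (memcmp(pa, pb, len) == 0);

out:
    if (pa != MAP_FAILED)
        munmap(pa, len);
    if (pb != MAP_FAILED)
        munmap(pb, len);
    if (fa >= 0)
        close(fa);
    if (fb >= 0)
        close(fb);
    return result;
}
```

Note that this always reads the target when the sizes match, which is
exactly the cost a checksum database would have to avoid in order to win.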

>I wonder if this is the case for non-x86 machines as well, since I
>suspect memcpy() uses the fast string routines available on x86
>machines.

What fast string routines?  On i486's, "rep cmpsd" is about twice as
slow as an unrolled loop written in C, assuming that all the data is in
the cache.  Sequential reads will bust the cache, so all reasonably good
comparison and simple checksum routines will be approximately as slow as
main memory.
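
To make "an unrolled loop written in C" concrete, here is a minimal
sketch of the general shape.  It is not the code that was timed: the
name words_equal(), the unroll factor of 4 and the use of uint32_t
words are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical routine: returns 1 if the first n 32-bit words of a
 * and b are equal, 0 otherwise.
 */
int words_equal(const uint32_t *a, const uint32_t *b, size_t n)
{
    /* Main loop: compare four words per iteration (assumed factor). */
    while (n >= 4) {
        if (a[0] != b[0] || a[1] != b[1] ||
            a[2] != b[2] || a[3] != b[3])
            return 0;
        a += 4;
        b += 4;
        n -= 4;
    }

    /* Tail: compare the remaining 0 to 3 words one at a time. */
    while (n > 0) {
        if (*a++ != *b++)
            return 0;
        n--;
    }
    return 1;
}
```

Unrolling just cuts the loop overhead per word compared.  Once the
buffers are larger than the cache, a loop like this and "rep cmpsd"
should both end up waiting on main memory, as described above.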

Bruce


