Date:      Sun, 26 Sep 2004 17:25:32 -0700
From:      Colin Percival <cperciva@wadham.ox.ac.uk>
To:        Giorgos Keramidas <keramida@freebsd.org>
Cc:        freebsd-security@freebsd.org
Subject:   compare-by-hash (was Re: sharing /etc/passwd)
Message-ID:  <41575DFC.9020206@wadham.ox.ac.uk>
In-Reply-To: <20040925140242.GB78219@gothmog.gr>
References:  <Pine.LNX.4.33.0111071900280.24824-100000@moroni.pp.asu.edu> <20011107211316.A7830@nomad.lets.net> <20040925140242.GB78219@gothmog.gr>

Giorgos Keramidas wrote:
> After reading a nice paper by Val Henson[1] I'm not so sure I'd trust
> sensitive information like password data to rsync without making sure
> that compare-by-hash is disabled if at all possible.

If you're going to disable compare-by-hash, you might as well just use 
rcp; but there's no theoretical justification for disabling 
compare-by-hash.  Henson's paper points out a number of cases where 
hashing causes problems, but none of these are issues with hashing 
itself; rather, the problems arise from using hashing with an 
insufficient number of bits.  Obviously, if you're searching for SHA1 
collisions, you shouldn't index your data by SHA1 hash; more generally, 
you shouldn't use a 160-bit hash *any* time that you're doing O(2^80) 
work, since at that scale birthday collisions become likely.
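
To put numbers on the O(2^80) remark, here is a minimal sketch of the
standard birthday-bound approximation, assuming hash outputs behave like
independent uniform random values. The function name
`collision_probability` is made up for illustration.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_items uniformly random
    hash_bits-bit values collide (birthday approximation).

    Uses p ~= 1 - exp(-k(k-1) / 2^(n+1)) for k items and an n-bit hash.
    """
    space = 2.0 ** hash_bits
    exponent = -num_items * (num_items - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for log2_items in (40, 60, 80):
        p = collision_probability(2 ** log2_items, 160)
        print(f"2^{log2_items} items, 160-bit hash: p ~= {p:.3g}")
```

Under this model, a 160-bit hash over 2^80 items collides with
probability of about 0.39, while at 2^40 items the probability is far
below anything a hardware fault would contribute.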

The above notwithstanding, rsync's choice of hash function leaves 
something to be desired; MD4 is completely broken, and (while it is 
still adequate for random inputs) it is easy to construct files which 
rsync will incorrectly synchronize.  (It's also trivial to construct 
files which consume ~100 times more CPU time on the server than normal 
-- unfortunately, this is an unavoidable consequence of using a fixed 
rolling hash function.)
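
To illustrate why a fixed rolling hash is open to this kind of abuse,
here is a minimal sketch of a weak rolling checksum in the style
described in the rsync technical report (two 16-bit sums, similar to
Adler-32). It is not rsync's actual code; the names `weak_checksum` and
`roll` and the toy 4-byte blocks are made up for illustration. Because
the function is fixed and public, anyone can construct distinct blocks
that share a weak checksum. Presumably each such false match forces the
expensive strong hash to be computed, which is one plausible reading of
the CPU-time remark above.

```python
M = 1 << 16  # both running sums are kept modulo 2^16


def weak_checksum(block: bytes):
    """Return (a, b): a is the byte sum, b is the position-weighted sum."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide an n-byte window one position in O(1).

    out_byte leaves the window on the left, in_byte enters on the right.
    """
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


# Two different blocks with identical weak checksums (hypothetical toy
# example): same byte sum (2) and same weighted sum (4+1 == 3+2).
BLOCK_A = b"\x01\x00\x00\x01"
BLOCK_B = b"\x00\x01\x01\x00"

if __name__ == "__main__":
    print(weak_checksum(BLOCK_A), weak_checksum(BLOCK_B))
```

A keyed or randomly seeded rolling hash would make such blocks hard to
prepare in advance, but that is a different design from the fixed
function sketched here.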

Colin Percival


