Date: Sat, 5 Sep 2009 11:16:03 -0700
From: Charlie Kester
To: freebsd-questions@freebsd.org
Subject: Re: Is there such thing as a 'soft checksum' tool?
Message-ID: <20090905181603.GA387@comcast.net>
In-Reply-To: <64c038660909050933h25a91edcw56688993f5557ad2@mail.gmail.com>

On Sat 05 Sep 2009 at 09:33:03 PDT Modulok wrote:
>List,
>
>I'm not even sure such a tool exists, but it's worth asking:
>
>I'm looking for a pseudo-checksum tool for use with cataloging
>images. For example, a strict checksum algorithm, like the sha family,
>will produce a dramatically different checksum for two files which
>differ by only a single bit. I'm looking for something where two
>images, which are similar, get a proportionally similar
>checksum. When I speak of similarities I'm referring to their image
>patterns, i.e., two images of differing sizes, which are otherwise
>identical, would produce very similar checksums. So the closer the
>checksums are, the more similar two given images are.
>
>Does anyone know of anything like this?

libpuzzle might be what you're looking for. There's a tool called ftwin
that uses libpuzzle to find duplicate or only slightly modified files.

http://libpuzzle.pureftpd.org/project/libpuzzle
http://jok.is-a-geek.net/ftwin.php

Both of these are in the ports tree. ;-)
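
To make the "soft checksum" idea concrete, here is a minimal Python sketch of an "average hash": shrink the image to a tiny grid, mark each cell as brighter or darker than the mean, and compare two such bit strings by counting differing bits. This is only an illustration of the general idea, not libpuzzle's algorithm or API. The function names (shrink, soft_checksum, distance, gradient) are made up for this example, and images are assumed to be plain lists of rows of grayscale values rather than real image files.

```python
# Hypothetical "average hash" sketch. NOT libpuzzle's algorithm or API.
# An image here is a list of rows of grayscale values (0-255), and it is
# assumed to be at least `size` pixels wide and tall.

def shrink(pixels, size=8):
    """Downscale to size x size by averaging equal blocks of pixels."""
    height, width = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def soft_checksum(pixels, size=8):
    """Return a size*size-bit integer: 1 where a cell is brighter than the mean."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def distance(a, b):
    """Hamming distance: number of bit positions where two checksums differ."""
    return bin(a ^ b).count("1")

def gradient(width, height):
    """Made-up test image: a diagonal brightness ramp from dark to light."""
    return [[255 * (x + y) // (width + height - 2) for x in range(width)]
            for y in range(height)]

if __name__ == "__main__":
    small = gradient(16, 16)
    large = gradient(32, 32)                      # same pattern, different size
    inverted = [[255 - v for v in row] for row in small]
    print("small vs large:   ", distance(soft_checksum(small), soft_checksum(large)))
    print("small vs inverted:", distance(soft_checksum(small), soft_checksum(inverted)))
```

The smaller the distance, the more alike the two images are under this scheme, so the resized copy should score much closer to the original than the inverted one does. A real tool such as libpuzzle has to cope with cropping, recompression and colour changes, which this sketch makes no attempt to handle.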