Date: Mon, 12 Mar 2001 02:45:25 -0500 (EST)
From: Trevor Johnson <trevor@jpj.net>
To: Kris Kennaway <kris@obsecurity.org>
Cc: <ports@FreeBSD.ORG>, Alistair Crooks <agc@pkgsrc.org>
Subject: Re: new message digest support in pkgsrc (fwd)
Message-ID: <20010310215713.Q23492-100000@blues.jpj.net>
In-Reply-To: <20010310180103.A28745@mollari.cthul.hu>

> We have two utilities in the base system which calculate
> MD5/SHA1/RIPEMD160 hashes (md5 and openssl). Actually, looks like md5
> only does md5, I thought it did the others too -- what is true is that
> we have two libraries which handle it -- libmd and libcrypto (and
> adding code to md5(1) would be trivial).

You're right. If no one has any objection, I'll delete the digest port.

As for overloading the md5 utility, it seems counter-intuitive to run a command called "md5" and get some other message digest from it. The OpenBSD folks have taken a similar approach. They have one binary which is hard-linked under different names:

$ ls -li `which md5` `which rmd160` `which sha1`
11545 -r-xr-xr-x  3 root  wheel  69632 Nov  6 09:10 /bin/md5
11545 -r-xr-xr-x  3 root  wheel  69632 Nov  6 09:10 /bin/rmd160
11545 -r-xr-xr-x  3 root  wheel  69632 Nov  6 09:10 /bin/sha1

(someone told me that using argv[0] that way is a bad practice, perhaps because the command won't work properly if it is renamed). They also have openssl, under /usr/ and dynamically linked (like ours):

$ ldd `which md5` `which openssl`
ldd: /bin/md5: not a dynamic executable
/usr/sbin/openssl:
        -lssl.2 => /usr/lib/libssl.so.2.4 (0x4004f000)
        -lcrypto.2 => /usr/lib/libcrypto.so.2.4 (0x4007e000)
        -lc.25 => /usr/lib/libc.so.25.2 (0x40127000)

Maybe they're paranoid about stuff that could be on a shared /usr/.

> I question the motivation for the NetBSD change. There are some
> theoretical weaknesses in MD5, but they aren't known to impact
> real-world uses.

At http://www.acm.org/pubs/citations/proceedings/commsec/191177/p210-van_oorschot/ I found the abstract of a 1994 paper, which says:

    [...] a $10 million custom machine for applying parallel collision search to the MD5 hash function could complete an attack with an expected run time of 24 days.

I haven't read the whole paper, but I conjecture that this parallel method could work, less efficiently, on an array of compromised, general-purpose microcomputers connected through the Internet. A black-hat cracking effort similar to this is described at http://distributed.net/trojans.html.en .

An article in CryptoBytes, "the technical newsletter of RSA Laboratories," published in 1996, says:

    The presented attack does not yet threaten practical applications of MD5, but it comes rather close. In view of the flexibility of the new analytic techniques it would be unwise to assume that the attack could not be improved. Ron Rivest [16] commented on the status of MD4, after two-round attacks had been found, that it is "at the edge" in terms of risking successful cryptanalytic attack. Today this assessment characterizes the status of MD5. Therefore we suggest that in the future MD5 should no longer be implemented in applications like signature schemes, where a collision-resistant hash function is required. According to our present knowledge, the best recommendations for alternatives to MD5 are SHA-1 and RIPEMD-160.

The newsletter is at ftp://ftp.rsasecurity.com/pub/cryptobytes/crypto2n1.pdf .

RFC 1828, written in 1995, says (brackets are in original):

    At the time of writing of this document, it is known to be possible to produce collisions in the compression function of MD5 [dBB93]. There is not yet a known method to exploit these collisions to attack MD5 in practice, but this fact is disturbing to some authors [Schneier94].

    It has also recently been determined [vOW94] that it is possible to build a machine for $10 Million that could find two chosen text variants with a common MD5 hash value. However, it is unclear whether this attack is applicable to a keyed MD5 transform.

    This attack requires approximately 24 days. The same form of attack is useful on any iterated n-bit hash function, and the time is entirely due to the 128-bit length of the MD5 hash.

    Although there is no substantial weakness for most IP security applications, it should be recognized that current technology is catching up to the 128-bit hash length used by MD5. Applications requiring extremely high levels of security may wish to move in the near future to algorithms with longer hash lengths.

I've heard that in a situation in which a hostile party can generate a message with innocent contents, present it to a trusted party for signing, then replace the message with one having hostile contents, the hostile party can more easily arrange a hash collision than in a situation where the innocent message is generated for innocent purposes. Either scenario can credibly happen in the ports collection.

The RIPEMD-160 home page at http://www.esat.kuleuven.ac.be/~bosselae/ripemd160.html cites the same articles. It says that RIPEMD-160 was designed to replace RIPEMD-128 because of the ACM paper:

    RIPEMD-128 is a plug-in substitute for RIPEMD (or MD4 and MD5, for that matter) with a 128-bit result. In view of the result of Paul van Oorschot and Mike Wiener mentioned earlier, 128-bit hash results do not offer sufficient protection for the next ten years, and applications using 128-bit hash functions should consider upgrading to a 160-bit hash function.

It also says that some aspects of SHA-1 are kept secret by the U.S. government (cue X-Files theme).

> I think switching to SHA1 for buzzword-compliance would be gratuitous.

Likewise, avoiding it purely because it has become a buzzword would be a poor decision. The SHA-1 algorithm is described at http://www.rsasecurity.com/rsalabs/faq/3-6-5.html as "more secure" than MD5 (MD5 is a trademark of RSA Security, for whom the algorithm was developed).

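As a rough illustration of the 128-bit versus 160-bit comparison in the sources quoted above, here is a minimal Python sketch. It assumes only the generic birthday bound, under which a collision search on an n-bit hash costs on the order of 2^(n/2) hash evaluations; it ignores algorithm-specific weaknesses and any particular hardware. The function name collision_work is hypothetical, chosen for this example.

```python
def collision_work(bits):
    """Approximate number of hash evaluations for a generic birthday
    collision search on a hash with the given output length in bits.
    Assumption: the 2^(n/2) birthday bound, nothing algorithm-specific."""
    return 2 ** (bits // 2)


# Compare a 128-bit result (MD5) with a 160-bit one (SHA-1, RIPEMD-160).
ratio = collision_work(160) // collision_work(128)
print(ratio)  # 65536, i.e. 2^16 times more work under this assumption
```

Under that assumption alone, moving from 128 to 160 bits multiplies the generic search cost by 2^16.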
> Even more ludicrous would be something like what OpenBSD does:
>
> MD5 (scanssh-1.4.tar.gz) = 843796cdb9361ed7e3d862a0e3a6ce16
> RMD160 (scanssh-1.4.tar.gz) = 8825be05348f1d5e8f53657a0de65f9b81320413
> SHA1 (scanssh-1.4.tar.gz) = 266d9de9a7965177b5d10ec0eed3de3e199ac237

At first glance, this looks crazy, but I see some advantages to it:

- it is "upward compatible" (so is the NetBSD way);
- users who find "make checksum" too slow can still use the MD5 hash (or one or two of the others);
- if one of the other hashes is shown to be weak, there's no need to panic because the other hash can be used, and has already been generated right after the porter looked at the contents of the distfile;
- there was no need for a flag day or to suddenly generate hundreds of new pkg/md5 files when the change was made (just over two years ago).

Two disadvantages are apparent. One is that "make makesum" must run more slowly. A porter who feels inconvenienced by this could choose to only provide the MD5 checksum, as before. The other is that md5 (for us, distinfo) files in CTM diffs or tarballs are bigger (on disk, most will still take up one block, on the usual filesystems). CTM diffs and ports tarballs on CD-ROMs are normally compressed, and with a compression utility worth its salt, 160-bit hashes should only take up about 160 bits (20 bytes). IMO these disadvantages are trivial.

The change for OpenBSD can be viewed at http://www.openbsd.org/cgi-bin/cvsweb/ports/infrastructure/mk/bsd.port.mk.diff?r1=1.74&r2=1.75 and for NetBSD, at http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/mk/bsd.pkg.mk.diff?r1=1.675&r2=1.676 .

Until Moore's Law is repealed, MD5 will only become less difficult to crack. Cryptographic experts have been recommending its replacement for some purposes since at least 1995. Better (longer) hash functions can be calculated by openssl, which is in our base system. The NetBSD and OpenBSD projects have adopted these functions for their ports (pkgsrc) collections.

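To make the multi-hash file format quoted above concrete, here is a minimal sketch in Python of generating such lines and checking a file against whichever hashes are recorded. It is an illustration only, not the bsd.port.mk logic of any of the projects: it uses Python's hashlib rather than the openssl command, and the names parse_distinfo, make_sums, and check_sums are hypothetical. RIPEMD-160 is skipped when the local hashlib build does not provide it.

```python
import hashlib
import re

# Matches lines such as "SHA1 (scanssh-1.4.tar.gz) = 266d9d..."
LINE_RE = re.compile(r"^(\w+) \((.+)\) = ([0-9a-f]+)$")

# Labels used in the quoted file, mapped to hashlib algorithm names.
ALGORITHMS = {"MD5": "md5", "SHA1": "sha1", "RMD160": "ripemd160"}


def parse_distinfo(text):
    """Return {filename: {label: hexdigest}} for every recognised line."""
    entries = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            label, filename, digest = match.groups()
            entries.setdefault(filename, {})[label] = digest
    return entries


def make_sums(filename, data):
    """Generate one line per algorithm that this Python build supports."""
    lines = []
    for label, name in ALGORITHMS.items():
        try:
            digest = hashlib.new(name, data).hexdigest()
        except ValueError:
            continue  # e.g. ripemd160 missing from the local OpenSSL build
        lines.append(f"{label} ({filename}) = {digest}")
    return "\n".join(lines)


def check_sums(filename, data, entries):
    """Check data against whichever recorded hashes can be computed here.

    Returns True only if at least one hash was checked and every checked
    hash matched.
    """
    checked = 0
    for label, expected in entries.get(filename, {}).items():
        name = ALGORITHMS.get(label)
        if name is None:
            continue  # unknown label: ignore it rather than fail
        try:
            actual = hashlib.new(name, data).hexdigest()
        except ValueError:
            continue  # algorithm unavailable locally
        if actual != expected:
            return False
        checked += 1
    return checked > 0
```

A file that records only an MD5 line still verifies, which is the "upward compatible" property described above; a file with all three lines is checked against every hash the local build can compute.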
The desirability of keeping more information about distfiles was anticipated by us during last year's reorganization (http://www.geocrawler.com/mail/msg.php3?msg_id=4418223&list=167), so the "md5" files have already been renamed. I'd like to see:

- the 160-bit hashes permitted (not required) in the distinfo file.
- a "makesum" target which generates all three hashes, using openssl.
- a "checksum" target which uses whichever hashes exist in distinfo.

-- 
Trevor Johnson
http://jpj.net/~trevor/gpgkey.txt

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-ports" in the body of the message