Message-Id: <201904042332.x34NWR0J029049@repo.freebsd.org>
From: Conrad Meyer
Date: Thu, 4 Apr 2019 23:32:27 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r345896 - head/usr.bin/sort
X-SVN-Group: head
X-SVN-Commit-Author: cem
X-SVN-Commit-Paths: head/usr.bin/sort
X-SVN-Commit-Revision: 345896
X-SVN-Commit-Repository: base
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
List-Id: SVN commit messages for the entire src tree (except for "user" and "projects")

Author: cem
Date: Thu Apr 4 23:32:27 2019
New Revision: 345896
URL: https://svnweb.freebsd.org/changeset/base/345896

Log:
  sort(1): randomcoll: Skip the memory allocation entirely

  There's no reason to order based on strcmp of ASCII digests instead of memcmp
  of the raw digests.

  While here, remove collision fallback.  If you collide two MD5s, they're
  probably the same string anyway.  If robustness against MD5 collisions is
  desired, maybe we shouldn't use MD5.

  None of the behavior of sort -R is specified by POSIX, so we're free to
  implement this however we like.  E.g., using a 128-bit counter and block
  cipher to generate unique indices for each line of input.

  PR:		230792 (2/many)
  Relnotes:	This will change the sort order for a given dataset with a given
		seed.  Other similarly breaking changes are planned.
  Sponsored by:	Dell EMC Isilon

Modified:
  head/usr.bin/sort/coll.c

Modified: head/usr.bin/sort/coll.c
==============================================================================
--- head/usr.bin/sort/coll.c	Thu Apr 4 23:30:27 2019	(r345895)
+++ head/usr.bin/sort/coll.c	Thu Apr 4 23:32:27 2019	(r345896)
@@ -990,8 +990,7 @@ randomcoll(struct key_value *kv1, struct key_value *kv
 {
 	struct bwstring *s1, *s2;
 	MD5_CTX ctx1, ctx2;
-	char *b1, *b2;
-	int cmp_res;
+	unsigned char hash1[MD5_DIGEST_LENGTH], hash2[MD5_DIGEST_LENGTH];
 
 	s1 = kv1->k;
 	s2 = kv2->k;
@@ -1004,24 +1003,16 @@ randomcoll(struct key_value *kv1, struct key_value *kv
 	if (s1 == s2)
 		return (0);
 
-	memcpy(&ctx1,&md5_ctx,sizeof(MD5_CTX));
-	memcpy(&ctx2,&md5_ctx,sizeof(MD5_CTX));
+	memcpy(&ctx1, &md5_ctx, sizeof(MD5_CTX));
+	memcpy(&ctx2, &md5_ctx, sizeof(MD5_CTX));
 
 	MD5Update(&ctx1, bwsrawdata(s1), bwsrawlen(s1));
 	MD5Update(&ctx2, bwsrawdata(s2), bwsrawlen(s2));
-	b1 = MD5End(&ctx1, NULL);
-	b2 = MD5End(&ctx2, NULL);
-	if (b1 == NULL || b2 == NULL)
-		err(2, "MD5End");
 
-	cmp_res = strcmp(b1,b2);
-	sort_free(b1);
-	sort_free(b2);
+	MD5Final(hash1, &ctx1);
+	MD5Final(hash2, &ctx2);
 
-	if (!cmp_res)
-		cmp_res = bwscoll(s1, s2, 0);
-
-	return (cmp_res);
+	return (memcmp(hash1, hash2, sizeof(hash1)));
 }
 
 /*
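
The log's point about comparing raw digests with memcmp rather than hex strings with strcmp can be illustrated with a minimal sketch. This is not the coll.c code: it computes no MD5 at all and works on caller-supplied 16-byte buffers standing in for digests. The names DIGEST_LEN, hex_encode, sign_of, cmp_hex and cmp_raw are hypothetical, and the lowercase hex rendering is an assumption about what the old string form looked like.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 16	/* hypothetical stand-in; same size as an MD5 digest */

/* Encode len raw bytes as lowercase hex, NUL-terminated (assumed format). */
void
hex_encode(const unsigned char *raw, size_t len, char *out, size_t outsz)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	assert(outsz >= 2 * len + 1);
	for (i = 0; i < len; i++) {
		out[2 * i] = digits[raw[i] >> 4];
		out[2 * i + 1] = digits[raw[i] & 0x0f];
	}
	out[2 * len] = '\0';
}

/* Reduce a comparison result to -1, 0 or 1. */
int
sign_of(int v)
{
	return ((v > 0) - (v < 0));
}

/* String style: render both buffers as hex text, then strcmp. */
int
cmp_hex(const unsigned char *a, const unsigned char *b)
{
	char ha[2 * DIGEST_LEN + 1], hb[2 * DIGEST_LEN + 1];

	hex_encode(a, DIGEST_LEN, ha, sizeof(ha));
	hex_encode(b, DIGEST_LEN, hb, sizeof(hb));
	return (sign_of(strcmp(ha, hb)));
}

/* Raw style: compare the bytes directly, no intermediate strings. */
int
cmp_raw(const unsigned char *a, const unsigned char *b)
{
	return (sign_of(memcmp(a, b, DIGEST_LEN)));
}
```

Calling cmp_raw(a, b) and cmp_hex(a, b) on two 16-byte arrays returns the sign of each comparison. The raw form needs no temporary buffers, which is the allocation the commit title refers to skipping.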