From owner-svn-src-all@freebsd.org Sat Jan 2 20:19:42 2016
Return-Path:
Delivered-To: svn-src-all@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 28CE7A5FB0B; Sat, 2 Jan 2016 20:19:42 +0000 (UTC) (envelope-from allanjude@freebsd.org)
Received: from mx1.scaleengine.net (mx1.scaleengine.net [209.51.186.6]) by mx1.freebsd.org (Postfix) with ESMTP id DE2C41AD5; Sat, 2 Jan 2016 20:19:41 +0000 (UTC) (envelope-from allanjude@freebsd.org)
Received: from [10.1.1.2] (unknown [10.1.1.2]) (Authenticated sender: allanjude.freebsd@scaleengine.com) by mx1.scaleengine.net (Postfix) with ESMTPSA id 8A690D899; Sat, 2 Jan 2016 20:19:40 +0000 (UTC)
Subject: Re: svn commit: r292955 - head/lib/libmd
To: Bruce Evans
References: <201512301804.tBUI4oGp065466@repo.freebsd.org> <20151231115651.R995@besplex.bde.org> <20151231143314.Y1520@besplex.bde.org> <5684D606.3080609@freebsd.org> <56857911.5010205@freebsd.org> <56876DF1.4030807@freebsd.org> <20160102210313.M934@besplex.bde.org>
Cc: "Jonathan T.
Looney" , src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
From: Allan Jude
X-Enigmail-Draft-Status: N1110
Message-ID: <568830DD.4080006@freebsd.org>
Date: Sat, 2 Jan 2016 15:19:41 -0500
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160102210313.M934@besplex.bde.org>
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Wi6QfO8nTjV1lrg40kwbNOGXqaFa80LtA"
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 02 Jan 2016 20:19:42 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--Wi6QfO8nTjV1lrg40kwbNOGXqaFa80LtA
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

On 2016-01-02 05:07, Bruce Evans wrote:
> On Sat, 2 Jan 2016, Allan Jude wrote:
>
>> On 2015-12-31 13:50, Allan Jude wrote:
>>> On 2015-12-31 13:32, Jonathan T. Looney wrote:
>>>> On 12/31/15, 2:15 AM, "Allan Jude" wrote:
>>>>
>>>>> It seems these problems also slow things down, a lot:
>>>>>
>>>>> # time md5 /media/md5test/bigdata
>>>>> MD5 (/media/md5test/bigdata) = 6afad0bf5d8318093e943229be05be67
>>>>> 4.310u 3.476s 0:07.79 99.8% 20+167k 0+0io 0pf+0w
>>>>> # time env LD_PRELOAD=/usr/obj/media/svn/md5/head/tmp/lib/libmd.so
>>>>> /usr/obj/media/svn/md5/head/sbin/md5/md5 /media/md5test/bigdata
>>>>> MD5 (/media/md5test/bigdata) = 6afad0bf5d8318093e943229be05be67
>>>>> 4.133u 0.354s 0:04.49 99.7% 20+167k 1+0io 0pf+0w
>>>>>
>>>>> (file is fully cached in ZFS ARC, dd reads it at 11GB/s)
>>>>>
>>>>> Will investigate more tomorrow.
>>>>
>>>> md5 will be slower than dd due to the extra processing it needs to
>>>> do to
>>>> generate the hash.
>>>> I suspect that explains the difference you're seeing
>>>> between those utilities.
>>>
>>> Sorry, you missed my point here.
>>>
>>> I replaced MDXFile() with the implementation included in my earlier
>>> email. Using the newer libmd with that code, cut the time to md5 the
>>> SAME data down a lot. I need to do a more scientific test on a box that
>>> isn't doing other stuff still though.
>>>
>>> The comment about dd doing 11GB/s, was just to clarify that I wasn't
>>> reading the file from disk, which would introduce other variables.
>>
>> I found the cause of my bogus benchmark, the world on my test machine
>> was just old enough to be missing jmg@'s bufsize patch.
>>
>> Now the difference is about 1 second on a 2GB file, so ignore my
>> foolishness.
>
> That patch is surprisingly new.
>
> The main slowness that I complained about was for the other path in md5
> that must be used for special files. That uses stdio so it suffers from
> stdio trusting st_blksize. But st_blksize is rarely as small as the old
> size BUFSIZ in MDXFile.
>
> Bruce
>

I did some experiments on MDXFilter, adjusting the buffer size to 16 KB
and calling setvbuf() on stdin before reading from it. It improved
things, but only marginally.

dd if=/mnt/bigzerofile bs=1m | md5

10 GB took 80 seconds with the unmodified md5, and 73.5 seconds with the
bigger buffer size.

I will try to set up and flamegraph it, and see if we can determine what
can be done to make it faster.
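
For reference, the kind of change described above can be sketched roughly
as follows. This is not the actual MDXFilter code: hash_stream(),
fnv1a_update() and SKETCH_BUFSIZE are hypothetical names, and FNV-1a
stands in for MD5Update() only so that the sketch builds without libmd.
It shows the shape of the experiment: ask stdio for a larger buffer with
setvbuf() before the first read, then read in chunks of the same size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical constant: the 16 KB buffer size tried in the experiment. */
#define SKETCH_BUFSIZE	(16 * 1024)

/* Standard 64-bit FNV-1a parameters. */
#define FNV_OFFSET	0xcbf29ce484222325ULL
#define FNV_PRIME	0x100000001b3ULL

/*
 * Stand-in for MD5Update(). This is FNV-1a, NOT MD5; it is used here
 * only to give the read loop something to feed.
 */
uint64_t
fnv1a_update(uint64_t h, const unsigned char *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= FNV_PRIME;
	}
	return (h);
}

/*
 * Read fp to EOF using a stdio buffer of the requested size.
 * Returns 0 on success and -1 on error; on success *hash and *nbytes
 * hold the stand-in hash and the number of bytes read.
 */
int
hash_stream(FILE *fp, size_t bufsize, uint64_t *hash, uint64_t *nbytes)
{
	unsigned char *chunk;
	size_t n;
	int rv;

	assert(bufsize > 0);

	/*
	 * setvbuf() has to come before any other operation on the stream.
	 * Passing NULL lets stdio allocate the buffer itself; the size is
	 * then only a request, and a libc may choose to ignore it.
	 */
	if (setvbuf(fp, NULL, _IOFBF, bufsize) != 0)
		return (-1);

	if ((chunk = malloc(bufsize)) == NULL)
		return (-1);

	*hash = FNV_OFFSET;
	*nbytes = 0;
	while ((n = fread(chunk, 1, bufsize, fp)) > 0) {
		*hash = fnv1a_update(*hash, chunk, n);
		*nbytes += n;
	}
	rv = ferror(fp) ? -1 : 0;
	free(chunk);
	return (rv);
}
```

For the piped case, the call would be
hash_stream(stdin, SKETCH_BUFSIZE, &hash, &nbytes), made before anything
else has read from stdin.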
-- 
Allan Jude

--Wi6QfO8nTjV1lrg40kwbNOGXqaFa80LtA
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQIcBAEBAgAGBQJWiDDgAAoJEBmVNT4SmAt++V4QAIFqNPQfNz70nWB4H9FLjwNv
/wYiRttW4F2qKpYTutWBihciRCFDwifLoVmdeUg4BJd+j63pOgpO3Y74pdpxkA2k
DEym4FIdMAlVTG2sw8PqxZlbpxEKU1qr+RIercR7AmiPyGZpkoHtR+3USWFNpR+A
6PoLpUr77TbwfXEhlegueiwnTWeu3OmqqJZTQ8+CJdn6za7hVeNlPLVYWvOnPYvx
65gLzLuy/cgwtb9CvabmTF577Vyi9nxGrm5SQIUIo5T40J8OJmLc9JhYZaTZ4qwg
pATgWxR3JoFVY0ZKloveWMUmB3FryVkPBfhHvGZTJ5VdVpc7PK5W6Uk/RX5EoqSJ
UB1Bmp0YnOafKuHzLLmzH24AcC4DShZiYwljaXJ/k+S7leZpjecyhb0veuTgMYvj
IYrtaVJY2zMaUJw9VPI2ksyFwXZ4e/jcSoGgsMuzHDm6Q+74jBv8D8PVZjhOKZ+1
hNzccQJqqTzjXOMNOX9xUqr+R4bAmzIIr63xQcZC/M9V46u0B15ZFi1VqEQkDrxy
tOvhvmh8BFrSyNZIxiFGLX6TePVMjCDvMpyGq+Z1nfWl1VOuUlLeNBiX/EjtBhIy
m710LajAvRU+YiYYWPeVEpnL33XQI/xqIQULq8UAXlKz139tePe47BROGs/ySikt
/QlrSshZw7NqOh8Z+UWI
=oG7U
-----END PGP SIGNATURE-----

--Wi6QfO8nTjV1lrg40kwbNOGXqaFa80LtA--