From: Matthew Seaman
Organization: Infracaninophile
Date: Sat, 23 Aug 2008 10:01:33 +0100
To: Walt Pawley
Cc: Oliver Fromme, freebsd-questions@freebsd.org
Subject: Re: sed/awk, instead of Perl
Message-ID: <48AFD1ED.5070800@infracaninophile.co.uk>
References: <200808220759.m7M7xuh0047625@lurza.secnetix.de>

Walt Pawley wrote:
> At 9:59 AM +0200 8/22/08, Oliver Fromme wrote:
>
>>  - The perl command you wrote above is pretty much a sed
>>    command anyway (except you incorrectly used non-portable
>>    regular expression syntax).  Why use perl to execute a
>>    sed command?
>
> At the risk of beating this to death, I just happened to
> stumble on a real world example of why one might want to use
> Perl for sed-ly stuff. I wanted to pull off the accessor's
> address from each line of an Apache access log file.
> So, I figured after this discussion that sed was the way to go.
> Then I got curious and did the following:
>
> wump$ ls -l Desktop/klog
> -rw-r--r--  1 wump  1001  52753322 22 Aug 16:37 Desktop/klog
> wump$ time sed "s/ .*//" Desktop/klog > kadr1
>
> real    0m10.800s
> user    0m10.580s
> sys     0m0.250s
> wump$ time perl -pe 's/ .*//' Desktop/klog > kadr2
>
> real    0m0.975s
> user    0m0.700s
> sys     0m0.270s
> wump$ cmp kadr1 kadr2
> wump$
>
> Why the disparity in execution speed? Beats me, but my G5's fans
> started to take off running the sed command. I don't think the
> Perl command took long enough to register thermally. Curious.
>
> FWIW: I did this with an older version of Mac OS X, rather than
> FreeBSD, so it could easily not show the same results if I moved
> the log file to a FreeBSD box and did it there.

Careful now. Have you accounted for the effect of the klog file
being cached in VM rather than having to be read afresh from disk?
That makes a very big difference to how fast it is processed.

To get meaningful data from this sort of test, do a dummy run or
two of each command in fairly quick succession, so that the file is
already cached, then repeat your test runs a number of times and look
at the mean and standard deviation of the execution times.

You'll often see "Student's t-test" mentioned: that's a statistical
test for assessing whether results calculated from a limited number
of samples represent different underlying distributions. It sounds
horribly complicated, but nowadays we have computers to do all the
difficult adding up, and the result is just a number that tells you
how well your supposition (that command 'a' is faster than command
'b') is supported by your results.

There's a neat little script somewhere that will automate that, and
even give you an ASCII graph output, but I cannot for the life of me
remember what it's called. Sorry.

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3, Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey
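
The measurement procedure described in the reply (a couple of discarded warm-up runs, repeated timed runs, mean and standard deviation, then a t-test) can be sketched roughly as below. This is only an illustration, not the script the message alludes to. The function names (time_command, welch_t, summarize, compare) and the defaults of 10 runs and 2 warm-ups are made up for the example, and it uses Welch's variant of the t-test, which does not assume that both commands have the same variance, rather than the classic Student's form.

```python
import math
import statistics
import subprocess
import time


def time_command(cmd, runs=10, warmups=2):
    """Return wall-clock times in seconds for `runs` executions of a shell command.

    The warm-up runs are discarded: they only serve to pull the input
    file into the OS cache so that every timed run sees the same conditions.
    """
    for _ in range(warmups):
        subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def welch_t(a, b):
    """Return (t, df): Welch's t statistic and its approximate degrees of freedom.

    Each sample needs at least two values, and at least one of them must
    have non-zero variance, otherwise the division below fails.
    """
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def summarize(label, samples):
    """Format the mean and sample standard deviation of a list of timings."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return f"{label}: mean={mean:.3f}s stddev={stdev:.3f}s n={len(samples)}"


def compare(cmd_a, cmd_b, runs=10):
    """Time two commands under the same conditions and print a comparison."""
    times_a = time_command(cmd_a, runs=runs)
    times_b = time_command(cmd_b, runs=runs)
    print(summarize("a", times_a))
    print(summarize("b", times_b))
    t, df = welch_t(times_a, times_b)
    print(f"t={t:.2f} with about {df:.1f} degrees of freedom")
```

For the commands in the quoted message, one would call something like compare("sed 's/ .*//' Desktop/klog", "perl -pe 's/ .*//' Desktop/klog"). The larger the absolute value of t relative to the critical value for that number of degrees of freedom (looked up in a t table), the better the data support the claim that the two commands really differ in speed.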