From: Giorgos Keramidas
To: Wojciech Puchar
Cc: Alexandre Vieira, freebsd-questions@freebsd.org
Subject: Re: scripting tip needed
Date: Thu, 02 Jul 2009 00:07:53 +0300
Message-ID: <87k52saz86.fsf@kobe.laptop>

On Wed, 1 Jul 2009 22:02:48 +0200 (CEST), Wojciech Puchar wrote:
>> Using an interactive language like Python you can actually *test* the
>> code as you are writing it.  This is a major win most of the time.
>
> could you explain what you mean? You can and you have to test a code on
> any language be it bash, ksh python or C

Yes.  I mean that one can directly interact with the interpreter in a
REPL prompt, doing stuff like:

    >>> import re
    >>> devre = re.compile(r'(/dev/\S+)\s+(\S+)\s.*$')
    >>> devre
    <_sre.SRE_Pattern object at 0x28462780>
    >>> devre.match('/dev/ad0s1d 1012974 390512 541426 42% /var')
    <_sre.SRE_Match object at 0x28432e78>
    >>> devre.match('/dev/ad0s1d 1012974 390512 541426 42% /var').groups()
    ('/dev/ad0s1d', '1012974')
    >>> devre = re.compile(r'(/dev/\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+).*$')
    >>> devre.match('/dev/ad0s1d 1012974 390512 541426 42% /var').groups()
    ('/dev/ad0s1d', '1012974', '390512', '541426', '42%', '/var')

See how I am 'refining' the initial regular expression without ever
leaving the Python prompt?  That sort of interactivity is entirely lost
when you have to edit a file, save it, switch screen(1) windows or type
^Z to background the editor, run a script, watch it fail and repeat.

Then I can keep testing bits and pieces of code:

    >>> from subprocess import Popen, PIPE
    >>> pipe = Popen(['df', '-k'], shell=False, stdout=PIPE).stdout
    >>> for l in pipe:
    ...     m = devre.match(l)
    ...     if m:
    ...         print "device %s, size %ld KB" % (m.group(1), long(m.group(2)))
    ...
    device /dev/ad0s1a, size 1012974 KB
    device /dev/ad0s1d, size 1012974 KB
    device /dev/ad0s1e, size 2026030 KB
    device /dev/ad0s1f, size 10154158 KB
    device /dev/ad0s1g, size 284455590 KB
    device /dev/md0, size 19566 KB
    >>>

So piping df output to a bit of Python code works!  That's nice.

Then once I have a 'rough idea' of how I want the script to work, I can
refactor the repetitive bits a little:

    >>> def devsize(line):
    ...     m = devre.match(line)
    ...     if m:
    ...         return (m.group(1), m.group(2))
    ...
    >>> devsize('/dev/ad0s1d 1012974 390512 541426 42% /var')
    ('/dev/ad0s1d', '1012974')

So here's a short function that returns a nice 2-item tuple (device
name, number of 1 KB blocks) for lines that match, and None for lines
that don't.  Can we pipe df output through it?

    >>> pipe = Popen(['df', '-k'], shell=False, stdout=PIPE).stdout
    >>> map(devsize, pipe.readlines())
    [ None,
      ('/dev/ad0s1a', '1012974'),
      None,
      ('/dev/ad0s1d', '1012974'),
      ('/dev/ad0s1e', '2026030'),
      ('/dev/ad0s1f', '10154158'),
      ('/dev/ad0s1g', '284455590'),
      None, None, None, None, None, None, None,
      None, None, None, None, None, None, None,
      ('/dev/md0', '19566'),
      None]
    >>>

It looks like we can do that too, but the tuple list may be more useful
if we drop the None items in the process:

    >>> pipe = Popen(['df', '-k'], shell=False, stdout=PIPE).stdout
    >>> [t for t in map(devsize, pipe.readlines()) if t]
    [ ('/dev/ad0s1a', '1012974'),
      ('/dev/ad0s1d', '1012974'),
      ('/dev/ad0s1e', '2026030'),
      ('/dev/ad0s1f', '10154158'),
      ('/dev/ad0s1g', '284455590'),
      ('/dev/md0', '19566') ]

So there it is.  A nice structure, supported by the core of the
language, using a readable, easy syntax, and listing all the /dev nodes
of my laptop along with their sizes in KBytes.

The entire thing was built 'piece by piece', in the same Python session,
and I now have not only a 'rough idea' of how the code should work, but
also a working copy of the code in my history.
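
The pieces from the session above could be collected into one small
script, roughly as sketched below.  This is only an illustration, not the
script from the original thread: it uses Python 3 syntax instead of the
Python 2 of the transcript, the names `devsizes`, `df_lines` and `SAMPLE`
are made up here, and the header and devfs lines in `SAMPLE` are invented
stand-ins for real df output (only the /dev/ad0s1d line is taken from the
session).

```python
import re
import subprocess

# The refined six-field pattern from the interactive session.
DEVRE = re.compile(r'(/dev/\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+).*$')


def devsize(line):
    """Return (device, size in 1 KB blocks) for a /dev line, else None."""
    m = DEVRE.match(line)
    if m:
        return (m.group(1), int(m.group(2)))
    return None


def devsizes(lines):
    """Map devsize over the lines and drop those that did not match."""
    return [t for t in map(devsize, lines) if t]


def df_lines():
    """Run 'df -k' and return its output as a list of text lines.

    Not called below; assumes a df(1) that accepts -k is on the PATH.
    """
    result = subprocess.run(['df', '-k'], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout.splitlines()


# Hypothetical sample input; only the ad0s1d line comes from the session.
SAMPLE = """\
Filesystem  1024-blocks    Used   Avail Capacity  Mounted on
/dev/ad0s1d     1012974  390512  541426    42%    /var
devfs                 1       1       0   100%    /dev
"""

for device, size in devsizes(SAMPLE.splitlines()):
    print("device %s, size %d KB" % (device, size))
```

On a live system, `devsizes(df_lines())` would take the place of the
sample text.  Keeping the parsing separate from the call to df means the
regular expression can still be tried out on a single string, just as in
the interactive session.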
Note the complete *lack* of care about how to append to a list, how to
create dynamic pairs of devicename-size tuples, how to map all elements
of a list through a function, and more importantly the complete and
utter lack of any sort of '"${[]}"' quoting for variable names, values,
nested expansions, and so on.

That's what I am talking about.  Shell scripts are nice, but if we are
not constrained for some reason to use only /bin/sh or ksh, there's no
excuse for wasting hours upon hours to decipher cryptic quoting rules
and exceptional edge-cases of "black quoting magic", just to get a short
job done.

Being able to _easily_ use higher level structures than a plain 'stream
of bytes' is nice :)
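
As a small illustration of that last point, the list of tuples shown
earlier in this message can be turned into other structures with very
little code.  The sketch below is an assumption about what one might want
to do next, not something from the original session; the names `pairs`,
`sizes`, `by_size` and `total_kb` are hypothetical, and it is written for
Python 3.

```python
# Tuples copied from the list comprehension output earlier in this message.
pairs = [
    ('/dev/ad0s1a', '1012974'),
    ('/dev/ad0s1d', '1012974'),
    ('/dev/ad0s1e', '2026030'),
    ('/dev/ad0s1f', '10154158'),
    ('/dev/ad0s1g', '284455590'),
    ('/dev/md0', '19566'),
]

# Dictionary keyed by device name, with the sizes converted to integers.
sizes = {device: int(blocks) for device, blocks in pairs}

# Largest device first; sorted() returns a new list of (device, size) pairs.
by_size = sorted(sizes.items(), key=lambda item: item[1], reverse=True)

# Sum of all sizes, in 1 KB blocks.
total_kb = sum(sizes.values())

for device, kb in by_size:
    print("%-12s %10d KB" % (device, kb))
print("%-12s %10d KB" % ("total", total_kb))
```

Nothing here needs quoting or word splitting: the device names and sizes
stay separate values from the moment they are parsed.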