From owner-freebsd-questions@FreeBSD.ORG  Wed Jul  1 22:12:52 2009
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 999DC1065670
	for <freebsd-questions@freebsd.org>;
	Wed,  1 Jul 2009 22:12:52 +0000 (UTC)
	(envelope-from keramida@ceid.upatras.gr)
Received: from poseidon.ceid.upatras.gr (poseidon.ceid.upatras.gr
	[150.140.141.169])
	by mx1.freebsd.org (Postfix) with ESMTP id 188618FC14
	for <freebsd-questions@freebsd.org>;
	Wed,  1 Jul 2009 22:12:52 +0000 (UTC)
	(envelope-from keramida@ceid.upatras.gr)
Received: from mail.ceid.upatras.gr (unknown [10.1.0.143])
	by poseidon.ceid.upatras.gr (Postfix) with ESMTP id 420A3EB5325;
	Thu,  2 Jul 2009 01:12:48 +0300 (EEST)
Received: from localhost (europa.ceid.upatras.gr [127.0.0.1])
	by mail.ceid.upatras.gr (Postfix) with ESMTP id 3492F450D0;
	Thu,  2 Jul 2009 01:12:48 +0300 (EEST)
X-Virus-Scanned: amavisd-new at ceid.upatras.gr
Received: from mail.ceid.upatras.gr ([127.0.0.1])
	by localhost (europa.ceid.upatras.gr [127.0.0.1]) (amavisd-new,
	port 10024)
	with ESMTP id mZGT-Aq9wmId; Thu,  2 Jul 2009 01:12:48 +0300 (EEST)
Received: from kobe.laptop (adsl177-174.kln.forthnet.gr [62.1.250.174])
	by mail.ceid.upatras.gr (Postfix) with ESMTP id C342B4503F;
	Thu,  2 Jul 2009 01:12:47 +0300 (EEST)
Received: from kobe.laptop (kobe.laptop [127.0.0.1])
	by kobe.laptop (8.14.3/8.14.3) with ESMTP id n61L8MYS001112;
	Thu, 2 Jul 2009 00:08:22 +0300 (EEST)
	(envelope-from keramida@ceid.upatras.gr)
Received: (from keramida@localhost)
	by kobe.laptop (8.14.3/8.14.3/Submit) id n61L7r29001111;
	Thu, 2 Jul 2009 00:07:53 +0300 (EEST)
	(envelope-from keramida@ceid.upatras.gr)
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>
References: <755cb9fc0907011040o28b82cdbjd5760b139f797050@mail.gmail.com>
	<87tz1wqkmu.fsf@kobe.laptop>
	<alpine.BSF.2.00.0907012202070.1817@wojtek.tensor.gdynia.pl>
Date: Thu, 02 Jul 2009 00:07:53 +0300
In-Reply-To: <alpine.BSF.2.00.0907012202070.1817@wojtek.tensor.gdynia.pl>
	(Wojciech Puchar's message of "Wed, 1 Jul 2009 22:02:48 +0200 (CEST)")
Message-ID: <87k52saz86.fsf@kobe.laptop>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Alexandre Vieira <nullpt@gmail.com>, freebsd-questions@freebsd.org
Subject: Re: scripting tip needed
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 01 Jul 2009 22:12:52 -0000

On Wed, 1 Jul 2009 22:02:48 +0200 (CEST), Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl> wrote:
>> Using an interactive language like Python you can actually *test* the
>> code as you are writing it.  This is a major win most of the time.
>
> could you explain what you mean? You can and you have to test a code on
> any language be it bash, ksh python or C

Yes.  I mean that one can interact directly with the interpreter at a
REPL prompt, doing stuff like:

    >>> import re
    >>> devre = re.compile(r'(/dev/\S+)\s+(\S+)\s.*$')
    >>> devre
    <_sre.SRE_Pattern object at 0x28462780>
    >>> devre.match('/dev/ad0s1d 1012974 390512 541426 42% /var')
    <_sre.SRE_Match object at 0x28432e78>
    >>> devre.match('/dev/ad0s1d 1012974 390512 541426 42% /var').groups()
    ('/dev/ad0s1d', '1012974')
    >>> devre = re.compile(r'(/dev/\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+).*$')
    >>> devre.match('/dev/ad0s1d 1012974 390512 541426 42% /var').groups()
    ('/dev/ad0s1d', '1012974', '390512', '541426', '42%', '/var')

See how I am 'refining' the initial regular expression without ever
leaving the Python prompt?  That sort of interactivity is entirely lost
when you have to edit a file, save it, switch screen(1) windows or type
^Z to background the editor, run a script, watch it fail and repeat.
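
If I wanted to keep the refined expression around after the session, one
possible next step is to give the six groups names.  This is only a
sketch in Python 3 syntax: the group names and the parse_df_line()
helper are my own invention for illustration, not something df or the
re module defines.

```python
import re

# Same six-column pattern as in the session above, with named groups.
# The names (device, blocks, ...) are illustrative assumptions.
DF_LINE = re.compile(
    r'(?P<device>/dev/\S+)\s+(?P<blocks>\S+)\s+(?P<used>\S+)\s+'
    r'(?P<avail>\S+)\s+(?P<capacity>\S+)\s+(?P<mount>\S+).*$')


def parse_df_line(line):
    """Return a dict of the six df columns, or None for non-/dev lines."""
    m = DF_LINE.match(line)
    return m.groupdict() if m else None


# The same sample line that was used at the prompt.
fields = parse_df_line('/dev/ad0s1d 1012974 390512 541426 42% /var')
```

With names in place, fields['mount'] reads better than m.group(6) when
the script grows.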

Then I can keep testing bits and pieces of code:

    >>> from subprocess import Popen, PIPE
    >>> pipe = Popen(['df', '-k'], shell=False, stdout=PIPE).stdout
    >>> for l in pipe:
    ...     m = devre.match(l)
    ...     if m:
    ...         print "device %s, size %ld KB" % (m.group(1), long(m.group(2)))
    ...
    device /dev/ad0s1a, size 1012974 KB
    device /dev/ad0s1d, size 1012974 KB
    device /dev/ad0s1e, size 2026030 KB
    device /dev/ad0s1f, size 10154158 KB
    device /dev/ad0s1g, size 284455590 KB
    device /dev/md0, size 19566 KB
    >>>

So piping df output into a bit of Python code works!  That's nice.  Then
once I have a 'rough idea' of how I want the script to work, I can
refactor the repetitive bits a little:

    >>> def devsize(line):
    ...     m = devre.match(line)
    ...     if m:
    ...         return (m.group(1), m.group(2))
    ...
    >>> devsize('/dev/ad0s1d 1012974 390512 541426 42% /var')
    ('/dev/ad0s1d', '1012974')

So here's a short function that returns a 2-item tuple (device name,
number of 1 KB blocks), or None for a line that does not start with a
/dev node.  Can we pipe df output through it?

    >>> pipe = Popen(['df', '-k'], shell=False, stdout=PIPE).stdout
    >>> map(devsize, pipe.readlines())
    [ None, ('/dev/ad0s1a', '1012974'), None, ('/dev/ad0s1d', '1012974'),
      ('/dev/ad0s1e', '2026030'), ('/dev/ad0s1f', '10154158'),
      ('/dev/ad0s1g', '284455590'), None, None, None, None, None, None,
      None, None, None, None, None, None, None, None,
      ('/dev/md0', '19566'), None]
    >>>

It looks like we can do that too, but the tuple list may be more useful
if we trim the None items in the process:

    >>> pipe = Popen(['df', '-k'], shell=False, stdout=PIPE).stdout
    >>> [t for t in map(devsize, pipe.readlines()) if t]
    [ ('/dev/ad0s1a', '1012974'), ('/dev/ad0s1d', '1012974'),
      ('/dev/ad0s1e', '2026030'), ('/dev/ad0s1f', '10154158'),
      ('/dev/ad0s1g', '284455590'), ('/dev/md0', '19566') ]

So there it is.  A nice structure, supported by the core of the
language, using a readable, easy syntax, and listing all the /dev nodes
of my laptop along with their sizes in KBytes.
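
As a rough illustration of why such a structure is handy, here is what
one might do with the list above.  The sketch is in Python 3 syntax and
the variable names are mine; the data is just the list printed by the
previous step, with the sizes converted from strings to integers.

```python
# The (device, size) pairs exactly as printed in the session above.
pairs = [('/dev/ad0s1a', '1012974'), ('/dev/ad0s1d', '1012974'),
         ('/dev/ad0s1e', '2026030'), ('/dev/ad0s1f', '10154158'),
         ('/dev/ad0s1g', '284455590'), ('/dev/md0', '19566')]

# Turn the pairs into a dictionary keyed by device name, with integer sizes.
sizes = {dev: int(kb) for dev, kb in pairs}

# The device with the most 1 KB blocks, and the total over all devices.
largest = max(sizes, key=sizes.get)
total_kb = sum(sizes.values())

# The devices ordered from largest to smallest.
by_size = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

None of this needs any special quoting or string splitting; the values
are ordinary tuples, dictionaries and integers.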

The entire thing was built 'piece by piece', in the same Python session,
and I now have not only a 'rough idea' of how the code should work, but
also a working copy of the code in my history.
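
Pasted together, the pieces from that history might end up as something
like the following.  Treat it as a sketch under a few assumptions, not
as the final script: it uses Python 3 syntax (so the pipe is opened in
text mode), it converts the sizes to int, and the df_lines() and
device_sizes() names plus the two non-device sample lines are made up
for illustration.

```python
import re
from subprocess import Popen, PIPE

# The first, two-group regular expression from the session.
DEVRE = re.compile(r'(/dev/\S+)\s+(\S+)\s.*$')


def devsize(line):
    """Return (device, size in 1 KB blocks) for a /dev line, else None."""
    m = DEVRE.match(line)
    if m:
        return (m.group(1), int(m.group(2)))
    return None


def device_sizes(lines):
    """Map every line through devsize() and drop the None results."""
    return [t for t in map(devsize, lines) if t]


def df_lines():
    """Run 'df -k' and return its output as a list of text lines."""
    # universal_newlines=True makes the pipe yield str instead of bytes.
    with Popen(['df', '-k'], shell=False, stdout=PIPE,
               universal_newlines=True) as proc:
        return proc.stdout.readlines()


# Illustrative input: one real line from the session plus two made-up
# lines that should be skipped because they do not start with /dev.
sample = [
    'Filesystem 1024-blocks Used Avail Capacity Mounted on',
    '/dev/ad0s1d 1012974 390512 541426 42% /var',
    'devfs 1 1 0 100% /dev',
]
result = device_sizes(sample)
```

On a live system, device_sizes(df_lines()) would take the place of the
sample list.  Keeping the parsing separate from the call to df means
the parsing part can still be tried at the prompt with hand-typed
lines, exactly as above.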

Note the complete *lack* of care about how to append to a list, how to
create dynamic pairs of devicename-size tuples, how to map all elements
of a list through a function, and more importantly the complete and
utter lack of any sort of '"${[]}"' quoting for variable names, values,
nested expansions, and so on.

That's what I am talking about.  Shell scripts are nice, but if we are
not constrained for some reason to use only /bin/sh or ksh, there's no
excuse for wasting hours upon hours to decipher cryptic quoting rules
and exceptional edge-cases of "black quoting magic", just to get a short
job done.  Being able to _easily_ use higher level structures than a
plain 'stream of bytes' is nice :)