From owner-freebsd-questions@FreeBSD.ORG  Fri Sep 14 07:42:40 2007
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8CCF616A417
	for <freebsd-questions@freebsd.org>;
	Fri, 14 Sep 2007 07:42:40 +0000 (UTC)
	(envelope-from iaccounts@ibctech.ca)
Received: from pearl.ibctech.ca (pearl.ibctech.ca [208.70.104.210])
	by mx1.freebsd.org (Postfix) with ESMTP id 3EC6B13C458
	for <freebsd-questions@freebsd.org>;
	Fri, 14 Sep 2007 07:42:40 +0000 (UTC)
	(envelope-from iaccounts@ibctech.ca)
Received: (qmail 55278 invoked by uid 1002); 14 Sep 2007 07:42:39 -0000
Received: from iaccounts@ibctech.ca by pearl.ibctech.ca by uid 89 with
	qmail-scanner-1.22 
	(spamassassin: 2.64.  Clear:RC:1(208.70.104.100):. 
	Processed in 11.111556 secs); 14 Sep 2007 07:42:39 -0000
Received: from unknown (HELO ?192.168.30.110?)
	(steve@ibctech.ca@208.70.104.100)
	by pearl.ibctech.ca with (DHE-RSA-AES256-SHA encrypted) SMTP;
	14 Sep 2007 07:42:27 -0000
Message-ID: <46EA3B6C.7050200@ibctech.ca>
Date: Fri, 14 Sep 2007 03:42:36 -0400
From: Steve Bertrand <iaccounts@ibctech.ca>
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: Jonathan McKeown <jonathan+freebsd-questions@hst.org.za>
References: <a9f4a3860709131016w54c12b6fy94fc2b0f286aea3d@mail.gmail.com>	<20070913183504.GC11683@slackbox.xs4all.nl>
	<200709140930.21142.jonathan+freebsd-questions@hst.org.za>
In-Reply-To: <200709140930.21142.jonathan+freebsd-questions@hst.org.za>
X-Enigmail-Version: 0.95.3
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Kurt Buff <kurt.buff@gmail.com>, freebsd-questions@freebsd.org
Subject: Re: Scripting question
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Sep 2007 07:42:40 -0000


>>> I don't have the perl skills, though that would be ideal.

-- snip --

> Another approach in Perl would be:
> 
> #!/usr/bin/perl
> my (%names, %dups);
> while (<>) {
>     my ($key) = split;
>     $dups{$key} = 1 if $names{$key};
>     $names{$key} = 1;
> }
> delete @names{keys %dups};
> #
> # keys %names is now an unordered list of only non-repeated elements
> # keys %dups is an unordered list of only repeated elements
> 
> split splits on whitespace, returning a list of fields which can be assigned 
> to a list of variables. Here we only want to capture the first field: split 
> is more efficient for this than using a regex. The first occurrence of $key 
> is in parens because it's actually a list of one variable name.
> 
> We build two hashes: one, %names, keyed by the original names (this is the 
> classic way to reduce duplicates to single occurrences, since the duplicated 
> keys overwrite the originals), and one, %dups, whose keys are names already 
> appearing in %names - the duplicated entries. Having done that we use a hash 
> slice to delete from %names all the keys of %dups, which leaves the keys of 
> %names holding all the entries which only appear once (and the keys of %dups 
> all the duplicated entries if that's useful).

I don't know if this is completely relevant, but it appears as though it
may help.

Bob Showalter once gave me this advice on the Perl Beginners list
(quoted, but snipped for clarity):

See "perldoc -q duplicate". If the array elements can be compared with
string semantics (as you are doing here), the following will work:

   my @array = do { my %seen; grep !$seen{$_}++, @clean };
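
Note that this idiom keeps the first occurrence of every element, so it
is not quite the same thing as the approach quoted above, which drops
every name that is repeated. Here is a minimal sketch of both, assuming
a made-up sample list in @clean (the names and the %count, @once and
@repeated variables are only for illustration, not from anyone's real
script):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data; @clean stands in for the list being filtered.
my @clean = qw(alice bob alice carol bob dave);

# Keep the first occurrence of each element, preserving input order.
# $seen{$_}++ returns the old count (0 the first time), so the negation
# is true only on the first sighting of each value.
my @array = do { my %seen; grep { !$seen{$_}++ } @clean };

# Count every element, then split into "seen exactly once" and
# "seen more than once" (each repeated name listed a single time).
my %count;
$count{$_}++ for @clean;
my @once     = grep { $count{$_} == 1 } @clean;
my @repeated = do { my %seen; grep { $count{$_} > 1 && !$seen{$_}++ } @clean };

print "deduplicated: @array\n";     # alice bob carol dave
print "only once:    @once\n";      # carol dave
print "repeated:     @repeated\n";  # alice bob
```

The counting version walks the list twice, but it keeps the original
order, which the keys of a hash do not.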

Steve