From: Steve Bertrand
Date: Fri, 14 Sep 2007 03:42:36 -0400
To: Jonathan McKeown
Cc: Kurt Buff, freebsd-questions@freebsd.org
Subject: Re: Scripting question
Message-ID: <46EA3B6C.7050200@ibctech.ca>

>>> I don't have the perl skills, though that would be ideal.

-- snip --

> Another approach in Perl would be:
>
> #!/usr/bin/perl
> my (%names, %dups);
> while (<>) {
>     my ($key) = split;
>     $dups{$key} = 1 if $names{$key};
>     $names{$key} = 1;
> }
> delete @names{keys %dups};
> #
> # keys %names is now an unordered list of only non-repeated elements
> # keys %dups is an unordered list of only repeated elements
>
> split splits on whitespace, returning a list of fields which can be assigned
> to a list of variables. Here we only want to capture the first field: split
> is more efficient for this than using a regex. The first occurrence of $key
> is in parens because it's actually a list of one variable name.
>
> We build two hashes, one, %names, keyed by the original names (this is the
> classic way to reduce duplicates to single occurrences, since the duplicated
> keys overwrite the originals), and one, %dups, whose keys are names already
> appearing in %names - the duplicated entries. Having done that we use a hash
> slice to delete from %names all the keys of %dups, which leaves the keys of
> %names holding all the entries which only appear once (and the keys of %dups
> all the duplicated entries if that's useful).

I don't know if this is completely relevant, but it looks as though it may
help. Bob Showalter once advised me on the Perl Beginners list as follows
(quoted, but snipped for clarity):

    see "perldoc -q duplicate"

    If the array elements can be compared with string semantics (as you
    are doing here), the following will work:

        my @array = do { my %seen; grep !$seen{$_}++, @clean };

Steve
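
P.S. The two idioms above answer slightly different questions: the two-hash
script throws away every name that repeats, while the %seen/grep one-liner
keeps the first copy of each name. Below is a rough sketch putting both side
by side. The sub names partition_keys and dedupe and the sample data are
made up for illustration, and the counting hash is a small variation on the
%names/%dups pair, not the script as posted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given a list of lines, return the first-field keys
# that occur exactly once and the keys that occur more than once,
# both in order of first appearance.
sub partition_keys {
    my @lines = @_;
    my (%count, @order);
    for my $line (@lines) {
        my ($key) = split ' ', $line;    # first whitespace-separated field
        next unless defined $key;        # skip blank lines
        push @order, $key unless $count{$key}++;
    }
    my @unique = grep { $count{$_} == 1 } @order;
    my @dups   = grep { $count{$_} > 1 } @order;
    return (\@unique, \@dups);
}

# The idiom from "perldoc -q duplicate": keep the first occurrence of
# every element and drop later repeats.
sub dedupe {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample input: "alice" and "bob" each appear twice.
my @sample = ("alice 1", "bob 2", "alice 3", "carol 4", "bob 5");
my @keys   = map { my ($k) = split ' ', $_; $k } @sample;

my ($unique, $dups) = partition_keys(@sample);
print "unique: @$unique\n";
print "dups: @$dups\n";
print "dedupe: ", join(' ', dedupe(@keys)), "\n";
```

With that sample it prints "unique: carol", "dups: alice bob" and
"dedupe: alice bob carol", so which idiom fits depends on whether the
repeated names should vanish entirely or be collapsed to one copy.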