From owner-freebsd-questions@FreeBSD.ORG  Thu Sep 13 19:33:13 2007
Message-ID: <62309.65.121.28.16.1189710531.squirrel@whippsthroughlife.servebeer.com>
In-Reply-To: <a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com>
References: <a9f4a3860709131016w54c12b6fy94fc2b0f286aea3d@mail.gmail.com>
	<20070913172001.GA78799@gizmo.acns.msu.edu>
	<a9f4a3860709131032q21bfefc2hf8d78cae53637576@mail.gmail.com>
	<20070913175510.GA78984@gizmo.acns.msu.edu>
	<a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com>
Date: Thu, 13 Sep 2007 12:08:51 -0700 (MST)
From: "Craig Whipp" <crwhipp@gmail.com>
To: "Kurt Buff" <kurt.buff@gmail.com>
User-Agent: SquirrelMail/1.4.10a
MIME-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
Cc: Jerry McAllister <jerrymc@msu.edu>, questions@freebsd.org
Subject: Re: Scripting question

> On 9/13/07, Jerry McAllister <jerrymc@msu.edu> wrote:
>> > The only space is the one separating the SMTP address from the OK or NO.
>>
>> Then you should be able to tell it to sort on the first token in
>> the string with white space as a separator and to eliminate
>> duplicates.   It has been a long time since I had need of sort. I
>> don't remember the arguments/flags but am sure that type of thing can be
>> done.
>>
>> ////jerry
>
> Ya know, it's really easy to get wrapped around the axle on this stuff.
>
> I think I may have a better solution. The file I'm trying to massage
> has a predecessor - the non-unique lines are the result of a
> concatenation of two files.
>
> Silly me, it's better to 'grep -v' with the one file vs. the second
> rather than trying to merge, sort and further massage the result. The
> fix will be to use sed against the first file to remove the ' NO',
> thus providing a clean argument for grepping the other file.
>
> Sigh.
>
> Kurt
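
For reference, the two approaches quoted above could be sketched roughly as
below. This is only a sketch: the function names and the roles of the files
are my assumptions, not anything from the thread. Note that "sort -u" keeps
one line per address, while the grep approach matches the addresses as fixed
substrings, so "a@x.com" would also match "aa@x.com".

```shell
#!/bin/sh
# Sketch only: function names and file roles are assumptions.

# Jerry's idea: sort on the first whitespace-separated field and keep
# a single line for each distinct value of that field.
dedupe_first_field() {
	sort -u -k1,1 "$1"
}

# Kurt's idea: strip the trailing " NO" from the first file, then use
# the remaining addresses as fixed-string patterns to drop matching
# lines from the second file.
drop_listed() {
	patterns=`mktemp` || return 1
	sed 's/ NO$//' "$1" > "$patterns"
	grep -v -F -f "$patterns" "$2"
	rm -f "$patterns"
}
```

Usage would be something like "dedupe_first_field list.txt" or
"drop_listed first.txt second.txt", with both file names hypothetical.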


It sounds like you've found your solution, but how about the shell script
below?  It drops every line whose first column appears more than once.
It is probably woefully inefficient, but it should work.

- Craig

########### begin script ##############
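#!/bin/sh
# Read in an input list of 2 column data pairs and output the pairs where
# the first columns are unique.

INPUT_FILE="list.txt"
OUTPUT_FILE="new_list.txt"
NON_UNIQ_LIST=""

# Collect every first-column value that occurs more than once.
for NON_UNIQ in `awk '{print $1}' "$INPUT_FILE" | sort | uniq -d`
do
	NON_UNIQ_LIST="$NON_UNIQ_LIST|$NON_UNIQ"
done

# Strip the leading "|".
NON_UNIQ_LIST=`echo "$NON_UNIQ_LIST" | sed 's/^.//'`

if [ -z "$NON_UNIQ_LIST" ]
then
	# No duplicates: an empty pattern would make grep -v drop every line.
	cat "$INPUT_FILE" > "$OUTPUT_FILE"
else
	# Anchor the match to the first column.  The values are still treated
	# as regular expressions, so a "." in an address matches any character.
	grep -vE "^($NON_UNIQ_LIST)([[:space:]]|\$)" "$INPUT_FILE" > "$OUTPUT_FILE"
fi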
########### end script ##############
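
Since addresses contain characters such as "." that are special in regular
expressions, a variant that compares the first column as a plain string may
be safer. Here is a minimal sketch using awk, which reads the file twice; the
function name unique_first_field is just something I made up for
illustration.

```shell
#!/bin/sh
# Sketch only: unique_first_field is a hypothetical helper name.
# Print the lines of file $1 whose first field occurs exactly once.
unique_first_field() {
	# Pass 1 (NR == FNR) counts each first field; pass 2 prints the
	# lines whose first field was counted exactly once.
	awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}
```

It would be called as "unique_first_field list.txt > new_list.txt". The
output keeps the original line order, and no pattern is ever built, so an
input without duplicates needs no special case.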