From: "Craig Whipp"
To: "Kurt Buff"
Cc: Jerry McAllister, questions@freebsd.org
Date: Thu, 13 Sep 2007 12:08:51 -0700 (MST)
Subject: Re: Scripting question

> On 9/13/07, Jerry McAllister wrote:
>> > The only space is the one separating the SMTP address from the OK or
>> NO.
>>
>> Then you should be able to tell it to sort on the first token in
>> the string with white space as a separator and to eliminate
>> duplicates. It has been a long time since I had need of sort. I
>> don't remember the arguments/flags but am sure that type of thing can be
>> done.
>>
>> ////jerry
>
> Ya know, it's really easy to get wrapped around the axle on this stuff.
>
> I think I may have a better solution. The file I'm trying to massage
> has a predecessor - the non-unique lines are the result of a
> concatenation of two files.
>
> Silly me, it's better to 'grep -v' with the one file vs. the second
> rather than trying to merge, sort and further massage the result. The
> fix will be to use sed against the first file to remove the ' NO',
> thus providing a clean argument for grepping the other file.
>
> Sigh.
>
> Kurt

It sounds like you've found your solution, but how about the below shell script?
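Probably woefully inefficient, but should work.

- Craig

########### begin script ##############
#!/bin/sh
# Read in an input list of 2 column data pairs and output the pairs
# where the first columns are unique.

INPUT_FILE="list.txt"
OUTPUT_FILE="new_list.txt"
NON_UNIQ_LIST=""

# Collect every first-column value that occurs more than once.
# The trailing space in '^ *1 ' keeps counts such as 10 or 11 from being dropped.
for NON_UNIQ in `awk '{print $1}' "$INPUT_FILE" | sort | uniq -c | grep -vE '^ *1 ' | awk '{print $2}'`
do
    NON_UNIQ_LIST=$NON_UNIQ_LIST"|"$NON_UNIQ
done
# Strip the leading "|".
NON_UNIQ_LIST=`echo "$NON_UNIQ_LIST" | sed 's/^.//'`

# An empty pattern would match every line, so copy the file when nothing repeats.
# Note: a '.' in an address is still treated as a regex wildcard here.
if [ -z "$NON_UNIQ_LIST" ]; then
    cp "$INPUT_FILE" "$OUTPUT_FILE"
else
    grep -vE "^($NON_UNIQ_LIST)[[:space:]]" "$INPUT_FILE" > "$OUTPUT_FILE"
fi
########### end script ##############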
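For comparison, the same filtering can probably be done without building a regular expression at all. The following is only a sketch, assuming the input really is whitespace-separated pairs as described in the thread; the function name unique_first_column is made up for illustration and does not come from the original script. It reads the file twice with awk: the first pass counts each first-column value, the second prints only the lines whose value was seen exactly once.

```shell
#!/bin/sh
# Sketch: keep only the lines whose first column occurs exactly once.
# unique_first_column is a hypothetical helper name, not from the thread.
unique_first_column() {
    # The same file is passed twice. While NR == FNR awk is still on the
    # first copy, so it only counts each first-column value. On the second
    # copy it prints the lines whose value was counted exactly once.
    awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}

# Example usage with the file names from the script above:
# unique_first_column list.txt > new_list.txt
```

Because the comparison is an exact string match on the first field, characters such as '.' in an address need no escaping, and an input with no repeated values is passed through unchanged.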