Date: Thu, 13 Sep 2007 12:08:51 -0700 (MST)
From: "Craig Whipp" <crwhipp@gmail.com>
To: "Kurt Buff" <kurt.buff@gmail.com>
Cc: Jerry McAllister <jerrymc@msu.edu>, questions@freebsd.org
Subject: Re: Scripting question
Message-ID: <62309.65.121.28.16.1189710531.squirrel@whippsthroughlife.servebeer.com>
In-Reply-To: <a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com>
References: <a9f4a3860709131016w54c12b6fy94fc2b0f286aea3d@mail.gmail.com>
    <20070913172001.GA78799@gizmo.acns.msu.edu>
    <a9f4a3860709131032q21bfefc2hf8d78cae53637576@mail.gmail.com>
    <20070913175510.GA78984@gizmo.acns.msu.edu>
    <a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com>
> On 9/13/07, Jerry McAllister <jerrymc@msu.edu> wrote:
>> > The only space is the one separating the SMTP address from the OK or NO.
>>
>> Then you should be able to tell it to sort on the first token in
>> the string with white space as a separator and to eliminate
>> duplicates. It has been a long time since I had need of sort. I
>> don't remember the arguments/flags but am sure that type of thing can be
>> done.
>>
>> ////jerry
>
> Ya know, it's really easy to get wrapped around the axle on this stuff.
>
> I think I may have a better solution. The file I'm trying to massage
> has a predecessor - the non-unique lines are the result of a
> concatenation of two files.
>
> Silly me, it's better to 'grep -v' with the one file vs. the second
> rather than trying to merge, sort and further massage the result. The
> fix will be to use sed against the first file to remove the ' NO',
> thus providing a clean argument for grepping the other file.
>
> Sigh.
>
> Kurt

It sounds like you've found your solution, but how about the shell script
below? Probably woefully inefficient, but it should work.

- Craig

########### begin script ##############
#!/bin/sh
# Read an input list of two-column data pairs and output only the pairs
# whose first column occurs exactly once in the input.

INPUT_FILE="list.txt"
OUTPUT_FILE="new_list.txt"

NON_UNIQ_LIST=""
# uniq -c prefixes each key with its count; keep the keys whose count is not 1.
# (Comparing the count numerically avoids also dropping counts such as 10 or 12.)
for NON_UNIQ in `awk '{print $1}' "$INPUT_FILE" | sort | uniq -c | awk '$1 != 1 {print $2}'`
do
    NON_UNIQ_LIST="$NON_UNIQ_LIST|$NON_UNIQ"
done
# Strip the leading "|" left over from the loop.
NON_UNIQ_LIST=`echo "$NON_UNIQ_LIST" | sed 's/^.//'`

if [ -n "$NON_UNIQ_LIST" ]; then
    # Anchor the pattern at the start of the line so an address only
    # matches the first column.
    grep -vE "^($NON_UNIQ_LIST) " "$INPUT_FILE" > "$OUTPUT_FILE"
else
    # No duplicates: an empty pattern would match (and drop) every line.
    cp "$INPUT_FILE" "$OUTPUT_FILE"
fi
########### end script ##############
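
For comparison, here is a rough sketch of two shorter routes to the same
kind of result. The first is the sort invocation Jerry was describing
(sort on the first whitespace-separated field and drop duplicate keys);
the second is a two-pass awk version of the script above. The function
names are made up for illustration, and the input is assumed to be
"address, one space, OK or NO" as described in the thread. Treat it as a
starting point rather than a tested replacement.

```shell
#!/bin/sh
# Hypothetical helper names; the input layout (address, space, OK/NO)
# is an assumption taken from the thread.

# Keep one line per address: sort on the first field only and let -u
# drop any line whose key has already been seen. Which of the duplicate
# lines survives is up to sort.
first_per_key() {
    sort -u -k1,1 "$1"
}

# Keep only addresses that occur exactly once. The file is read twice:
# the first pass counts each key, the second pass prints the lines whose
# key was counted once. Input order is preserved.
only_unique_keys() {
    awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}
```

Usage would look something like `only_unique_keys list.txt > new_list.txt`.
The awk version never builds a regular expression from the addresses, so
dots and other special characters in them cannot cause false matches.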