Date:      Thu, 13 Sep 2007 12:37:17 -0700
From:      "Kurt Buff" <kurt.buff@gmail.com>
To:        "Craig Whipp" <crwhipp@gmail.com>
Cc:        questions@freebsd.org
Subject:   Re: Scripting question
Message-ID:  <a9f4a3860709131237h69f9741fuf99436635029ae9d@mail.gmail.com>
In-Reply-To: <62309.65.121.28.16.1189710531.squirrel@whippsthroughlife.servebeer.com>
References:  <a9f4a3860709131016w54c12b6fy94fc2b0f286aea3d@mail.gmail.com> <20070913172001.GA78799@gizmo.acns.msu.edu> <a9f4a3860709131032q21bfefc2hf8d78cae53637576@mail.gmail.com> <20070913175510.GA78984@gizmo.acns.msu.edu> <a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com> <62309.65.121.28.16.1189710531.squirrel@whippsthroughlife.servebeer.com>

On 9/13/07, Craig Whipp <crwhipp@gmail.com> wrote:
> > On 9/13/07, Jerry McAllister <jerrymc@msu.edu> wrote:
> >> > The only space is the one separating the SMTP address from the OK or NO.
> >>
> >> Then you should be able to tell it to sort on the first token in
> >> the string with white space as a separator and to eliminate
> >> duplicates.   It has been a long time since I had need of sort. I
> >> don't remember the arguments/flags but am sure that type of thing can be
> >> done.
> >>
> >> ////jerry
> >
> > Ya know, it's really easy to get wrapped around the axle on this stuff.
> >
> > I think I may have a better solution. The file I'm trying to massage
> > has a predecessor - the non-unique lines are the result of a
> > concatenation of two files.
> >
> > Silly me, it's better to 'grep -v' with the one file vs. the second
> > rather than trying to merge, sort and further massage the result. The
> > fix will be to use sed against the first file to remove the ' NO',
> > thus providing a clean argument for grepping the other file.
> >
> > Sigh.
> >
> > Kurt
>
>
> It sounds like you've found your solution, but how about the below shell
> script?  Probably woefully inefficient, but should work.
>
> - Craig
>
> ########### begin script ##############
> #!/bin/sh
> # Read in an input list of 2 column data pairs and output the pairs where
> # the first columns are unique.
>
> INPUT_FILE="list.txt"
> OUTPUT_FILE="new_list.txt"
> NON_UNIQ_LIST=""
>
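> # Collect every first-column value that occurs more than once.
> # '^ *1 ' matches a count of exactly 1 (not 10, 11, ...).
> for NON_UNIQ in `awk '{print $1}' "$INPUT_FILE" | sort | uniq -c |
> grep -vE '^ *1 ' | awk '{print $2}'`
> do
>         NON_UNIQ_LIST="$NON_UNIQ_LIST|$NON_UNIQ"
> done
>
> # Strip the leading "|" from the alternation.
> NON_UNIQ_LIST=`echo "$NON_UNIQ_LIST" | sed 's/^.//'`
>
> # With no duplicates the pattern is empty, so just copy the input.
> if [ -z "$NON_UNIQ_LIST" ]; then
>         cat "$INPUT_FILE" > "$OUTPUT_FILE"
> else
>         grep -vE "^($NON_UNIQ_LIST)[[:space:]]" "$INPUT_FILE" > "$OUTPUT_FILE"
> fi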
> ########### end script ##############
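
For comparison, the filtering the script above describes (keep only the
lines whose first column occurs exactly once) can probably be done by
letting awk read the file twice. This is only a sketch: the function name
unique_first_column is made up for illustration, and it assumes the first
column is separated from the rest by whitespace, as described earlier in
the thread.

```shell
#!/bin/sh
# unique_first_column FILE
# Print the lines of FILE whose first whitespace-separated field occurs
# exactly once. The function name is hypothetical, chosen for this sketch.
unique_first_column() {
    # awk is given the same file twice. During the first read
    # (NR == FNR) it only counts each first-field value; during the
    # second read it prints the lines whose value was counted once.
    awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}
```

With the file names from the script, usage would be something like
"unique_first_column list.txt > new_list.txt". The input order is kept, and
no regular expression is built from the addresses, so dots and other
metacharacters in them should not matter.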


I'll fiddle with this too, but I like the perl better.
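
The 'grep -v' idea quoted above (strip the ' NO' from the first file with
sed, then use the result to filter the other file) might look roughly like
the sketch below. The function name drop_listed and the argument order are
assumptions made for illustration, as is the use of grep's -F and -f
options to treat each address as a fixed string read from a pattern file.

```shell
#!/bin/sh
# drop_listed FIRST SECOND
# Print the lines of SECOND that do not contain any address listed in
# FIRST. FIRST is assumed to hold lines of the form "address NO".
# The function name and argument order are hypothetical.
drop_listed() {
    patterns=$(mktemp "${TMPDIR:-/tmp}/drop_listed.XXXXXX") || return 1
    # Remove the trailing " NO" so only the bare address remains.
    sed 's/ NO$//' "$1" > "$patterns"
    # -F: patterns are fixed strings, so dots are literal.
    # -f: read one pattern per line from the file.
    # -v: print only the lines that match none of them.
    grep -v -F -f "$patterns" "$2"
    status=$?
    rm -f "$patterns"
    # grep exits 1 when it prints nothing; only 2 or more is an error.
    [ "$status" -le 1 ]
}
```

One caveat: fixed strings still match anywhere in the line, so an address
in the first file that is a substring of a longer address in the second
file would remove that line as well.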


