FreeBSD Mail Archives

Date:      Tue, 6 Jan 2004 13:45:04 +1030
From:      Malcolm Kay <malcolm.kay@internode.on.net>
To:        Gautam Gopalakrishnan <ggop@madras.dyndns.org>
Cc:        zhangweiwu@realss.com
Subject:   Re: help me with this sed expression
Message-ID:  <200401061345.04575.malcolm.kay@internode.on.net>
In-Reply-To: <20040106022052.GA8122@madras.dyndns.org>
References:  <Law11-F31WerOc0Ne0P00016107@hotmail.com> <200401061230.42038.malcolm.kay@internode.on.net> <20040106022052.GA8122@madras.dyndns.org>



On Tue, 6 Jan 2004 12:50, Gautam Gopalakrishnan wrote:
> On Tue, Jan 06, 2004 at 12:30:42PM +1030, Malcolm Kay wrote:
> > On Mon, 5 Jan 2004 22:19, Zhang Weiwu wrote:
> > > Hello. I've worked an hour to figure out a series of sed commands to
> > > process some text (without any luck, you know I'm kind of a newbie). I
> > > would really appreciate your help.
> > >
> > > The original text file is in this form -- for each line:
> > > one Chinese word, then one or two English words separated by a space.
> > >
> > > I tried to do things like s/\(.*\)\([a-z]*\)/\2 \1/ but the first
> > > \(.*\) is too greedy and included the rest [a-z].
> >
> > Well the greedy part is easily fixed with:
> >   s/\([^a-z]*\)\([a-z]*\)/\2 \1/
> >
> > But this will not work for those lines with 2 English words. The
> > following should:
> >   % sed -n -e 's/\([^a-z]*\)\([a-z]*\) .*/\2 \1/p' \
> >         -e 's/\([^a-z]*\)[a-z]* \([a-z]*\)/\2 \1/p' original > target
>
> I think awk is easier:
>
> awk '{print $2 " " $3 " " $1}' original | tr -s ' ' > target

I'm not really very familiar with awk, but I must say this
is a much simpler and rather magical solution.

How does awk know which part of the original line goes into $1, $2 and $3?
(You will notice there is no space between the Chinese and English words.)
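
For reference, awk by default splits each line into fields on runs of blanks only, so a Chinese word joined directly to the first English word lands in the same field. A minimal sketch to show this, where show_fields is a hypothetical helper name and '????' merely stands in for the Chinese word:

```shell
# show_fields: print the field count and the first three fields of each
# input line, so the default whitespace splitting is visible.
# (Hypothetical helper, for illustration only.)
show_fields() {
    awk '{ printf "NF=%d $1=[%s] $2=[%s] $3=[%s]\n", NF, $1, $2, $3 }'
}

# '????' is a stand-in for the Chinese word.
printf '%s\n' '????a av' | show_fields
# prints: NF=2 $1=[????a] $2=[av] $3=[]
```

With this input $3 is empty and $1 still holds the Chinese word glued to the first English word.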

I am also mystified how it generates two lines

  a ????
  av ????

from the input
  ????a av

Malcolm Kay

