From owner-freebsd-questions@FreeBSD.ORG  Mon Jan  5 19:15:10 2004
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Content-Type: text/plain;
  charset="iso-8859-1"
From: Malcolm Kay <malcolm.kay@internode.on.net>
Organization: At home
To: Gautam Gopalakrishnan <ggop@madras.dyndns.org>
Date: Tue, 6 Jan 2004 13:45:04 +1030
User-Agent: KMail/1.4.3
References: <Law11-F31WerOc0Ne0P00016107@hotmail.com>
	<200401061230.42038.malcolm.kay@internode.on.net>
	<20040106022052.GA8122@madras.dyndns.org>
In-Reply-To: <20040106022052.GA8122@madras.dyndns.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Message-Id: <200401061345.04575.malcolm.kay@internode.on.net>
cc: Zhang Weiwu <weiwuzhang@hotmail.com>
cc: questions@freebsd.org
cc: zhangweiwu@realss.com
Subject: Re: help me with this sed expression

On Tue, 6 Jan 2004 12:50, Gautam Gopalakrishnan wrote:
> On Tue, Jan 06, 2004 at 12:30:42PM +1030, Malcolm Kay wrote:
> > On Mon, 5 Jan 2004 22:19, Zhang Weiwu wrote:
> > > Hello. I've worked an hour to figure out a series of sed commands to
> > > process some text (without any luck; you know I'm kind of a newbie). I
> > > would really appreciate your help.
> > >
> > > The original text file is in this form -- for each line:
> > > one Chinese word, then one or two English words separated by a space.
> > >
> > > I tried to do things like s/\(.*\)\([a-z]*\)/\2 \1/ but the first
> > > \(.*\) is too greedy and included the rest [a-z].
> >
> > Well the greedy part is easily fixed with:
> >   s/\([^a-z]*\)\([a-z]*\)/\2 \1/
> >
> > But this will not work for those lines with two English words. The
> > following should:
> >   % sed -n -e 's/\([^a-z]*\)\([a-z]*\) .*/\2 \1/p' \
> >         -e 's/\([^a-z]*\)[a-z]* \([a-z]*\)/\2 \1/p' original > target
>
> I think awk is easier:
>
> awk '{print $2 " " $3 " " $1}' original | tr -s ' ' > target

I'm not really very familiar with awk, but I must say this
is a much simpler and rather magical solution.

How does awk know which part of the original line goes into $1, $2 and $3?
(You will notice there is no space between the Chinese and English words.)
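
As far as I understand it, awk by default splits each line into fields on
runs of blanks and on nothing else, so it would have no way to separate the
Chinese word from the English one. A minimal sketch to check this, using
'????a av' as a stand-in for a real line (the variable names are mine, for
illustration only):

```shell
# Stand-in for one line of the original file; "????" represents the Chinese word.
line='????a av'

# Show the field count and the first three fields as awk sees them.
fields=$(printf '%s\n' "$line" | awk '{ print NF ": [" $1 "] [" $2 "] [" $3 "]" }')
printf '%s\n' "$fields"

# Run the awk expression from the thread on the same line, squeezing repeated spaces.
swapped=$(printf '%s\n' "$line" | awk '{ print $2 " " $3 " " $1 }' | tr -s ' ')
printf '%s\n' "$swapped"
```

If that is right, $1 is '????a', $2 is 'av' and $3 is empty, so the Chinese
word stays glued to the first English word.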

I am also mystified how it generates two lines

  a ????
  av ????

from the input
  ????a av
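
For comparison, a minimal sketch of one way to get both output lines from
that input with sed. It is a variant, not the command quoted above: because a
second s command sees the pattern space as already modified by the first,
this version keeps a copy of the line in the hold space and restores it
before the second substitution. The sample line and the variable names are
assumptions for illustration:

```shell
# Stand-in for one line of the original file; "????" represents the Chinese word.
line='????a av'

# h saves the untouched line, the first s prints "<first English word> <Chinese>",
# g restores the saved line, and the second s prints the second English word
# (only when the line actually has one).
result=$(printf '%s\n' "$line" | sed -n \
  -e 'h' \
  -e 's/^\([^a-z]*\)\([a-z]*\).*/\2 \1/p' \
  -e 'g' \
  -e 's/^\([^a-z]*\)[a-z]* \([a-z]*\)$/\2 \1/p')
printf '%s\n' "$result"
```

A line with only one English word should produce a single output line, since
the second substitution needs a space to match.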

Malcolm Kay