From owner-freebsd-questions@FreeBSD.ORG Mon Jan 5 19:15:10 2004
From: Malcolm Kay
Organization: At home
To: Gautam Gopalakrishnan
cc: Zhang Weiwu
cc: questions@freebsd.org
cc: zhangweiwu@realss.com
Date: Tue, 6 Jan 2004 13:45:04 +1030
Subject: Re: help me with this sed expression
Message-Id: <200401061345.04575.malcolm.kay@internode.on.net>
In-Reply-To: <20040106022052.GA8122@madras.dyndns.org>
References: <200401061230.42038.malcolm.kay@internode.on.net> <20040106022052.GA8122@madras.dyndns.org>

On Tue, 6 Jan 2004 12:50, Gautam Gopalakrishnan wrote:
> On Tue, Jan 06, 2004 at 12:30:42PM +1030, Malcolm Kay wrote:
> > On Mon, 5 Jan 2004 22:19, Zhang Weiwu wrote:
> > > Hello. I've worked for an hour trying to figure out a series of sed
> > > commands to process some text (without any luck; you know I'm kind of
> > > a newbie). I would really appreciate your help.
> > >
> > > The original text file is in this form -- for each line:
> > > one Chinese word, then one or two English words separated by a space.
> > >
> > > I tried to do things like s/\(.*\)\([a-z]*\)/\2 \1/ but the first
> > > \(.*\) is too greedy and included the rest [a-z].
> >
> > Well the greedy part is easily fixed with:
> > s/\([^a-z]*\)\([a-z]*\)/\2 \1/
> >
> > But this will not work for those lines with 2 English words. The
> > following should:
> > % sed -n -e 's/\([^a-z]*\)\([a-z]*\) .*/\2 \1/p' -e 's/\([^a-z]*\)[a-z]* \([a-z]*\)/\2 \1/p' original > target
>
> I think awk is easier:
>
> awk '{print $2 " " $3 " " $1}' original | tr -s > target

I'm not really very familiar with awk, but I must say this is a
much simpler and rather magical solution.

How does awk know which part of the original line goes into $1, $2 and $3?
(You will notice there is no space between the Chinese and English words.)

I am also mystified how it generates two lines
  a ????
  av ????
from the input
  ????a av

Malcolm Kay
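
For anyone who wants to check the field-splitting question on their own machine, here is a minimal sketch. It assumes awk's default behaviour of splitting fields on runs of blanks, and it uses the made-up ASCII placeholder "ZH" for the Chinese word (my assumption, purely to keep the sample free of encoding trouble). The variable names are hypothetical as well.

```shell
# Hypothetical sample line: "ZH" stands in for the Chinese word, and there is
# no space between it and the first English word, as in the original file.
line='ZHa av'

# With the default field separator, awk splits only on blanks, so this line
# has two fields: the Chinese word stays glued to the first English word.
nf=$(printf '%s\n' "$line" | awk '{ print NF }')
first=$(printf '%s\n' "$line" | awk '{ print $1 }')
second=$(printf '%s\n' "$line" | awk '{ print $2 }')
printf 'fields=%s $1=%s $2=%s\n' "$nf" "$first" "$second"

# The non-greedy sed expression from earlier in the thread, applied to a line
# with a single English word: [^a-z]* stops at the first lowercase letter.
swapped=$(printf '%s\n' 'ZHa' | sed 's/\([^a-z]*\)\([a-z]*\)/\2 \1/')
printf 'swapped=%s\n' "$swapped"
```

If awk behaves as assumed, $1 still contains the Chinese word together with the first English word, so printing $2, $3 and $1 in a different order cannot separate them on its own.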