Date: Mon, 05 Jan 2004 19:49:43 +0800
From: "Zhang Weiwu" <weiwuzhang@hotmail.com>
To: questions@freebsd.org
Subject: help me with this sed expression
Message-ID: <Law11-F31WerOc0Ne0P00016107@hotmail.com>

Hello.

I've spent an hour trying to work out a series of sed commands to process some text, without any luck (you know I'm kind of a newbie). I'd really appreciate your help.

The original text file has this form: each line holds one Chinese word followed by one or two English words; the English words are separated by a space.

I wish to change it so that:

1) each line of the target file holds one English word, then a space, then the Chinese word corresponding to that English word;
2) if a Chinese word in the original file is followed by more than one English word on the same line, the Chinese word is repeated to satisfy 1).

Define: Chinese word = one or more contiguous bytes of data where each byte is greater than 128 in value. (This is true in the GB2312 Chinese charset, which this email is written in.)
Define: English word = one or more contiguous bytes of [a-z].

Say, for the original file:
===========
一a av
可歌可泣aaav
无可奉告aacm
===========
The target file should be:
===========
a 一
av 一
aaav 可歌可泣
aacm 无可奉告
===========

I tried things like

s/\(.*\)\([a-z]*\)/\2 \1/

but the first \(.*\) is too greedy and swallows the trailing [a-z] characters as well, leaving nothing for the second group.

Thank you.
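
One possible way to sidestep the greedy \(.*\) is to match the Chinese word with a bracket expression that excludes [a-z] and the space, so the first group cannot run into the English letters. The following is only a sketch under stated assumptions: the function name swap_words is hypothetical, sed is assumed to support -E (extended regular expressions), and every line is assumed to match the form described above exactly (lines that do not are passed through unchanged).

```shell
# swap_words: hypothetical helper, reads the original file on stdin
# and writes the target format to stdout.
swap_words() {
    # LC_ALL=C makes sed work on plain bytes, so every byte of a
    # Chinese word (each greater than 128) falls outside [a-z ].
    # 1st expression: Chinese word + two English words -> two output
    #   lines; the backslash before the line break inserts a newline
    #   in the replacement.
    # 2nd expression: Chinese word + one English word -> one line.
    LC_ALL=C sed -E \
        -e 's/^([^a-z ]+)([a-z]+) ([a-z]+)$/\2 \1\
\3 \1/' \
        -e 's/^([^a-z ]+)([a-z]+)$/\2 \1/'
}
```

It could be used as "swap_words < original.txt > target.txt" (both file names are placeholders). The two-word expression comes first; once it has fired, the pattern space starts with an English letter, so the one-word expression no longer matches and the line is not rewritten twice.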