Date: Sat, 27 Jun 2009 20:49:21 +0800
From: Zhang Weiwu <zhangweiwu@realss.com>
To: FreeBSD Questions mailing list <freebsd-questions@freebsd.org>
Subject: scripting suggestion: how to make this command shorter
Message-ID: <4A461551.7090702@realss.com>
Hello.

I wrote this one-line command to fetch a page from a long URI, parse it twice (the first time to get the subject, the second time to get the content), and send the result to me as email.

$ w3m -dump 'http://search1.taobao.com/browse/33/n-g,w6y4zzjaxxymvjomxy----------------40--commend-0-all-33.htm?at_topsearch=1&ssid=e-s5' | grep -A 100 对比 | mail -a 'Content-Type: text/plain; charset=UTF-8' -s '=?UTF-8?B?'`w3m -dump 'http://search1.taobao.com/browse/33/n-g,w6y4zzjaxxymvjomxy----------------40--commend-0-all-33.htm?at_topsearch=1&ssid=e-s5' | grep 找到.*件 | base64 -w0`'?=' zhangweiwu@realss.com

The stupid part of this script is that it fetches the page twice and parses it twice, which makes the command very long. If I could write the command so that the URI appears only once, it would be easier for me to maintain. I plan to put it in cron, and I want to avoid having to modify two places when the URI changes (and it does!).

How do you suggest optimizing the one-liner?

By the way, I find it stupid to have to wrap the subject like this:

$ mail -s '=?UTF-8?B?'`echo "$subject" | base64`'?='

instead of

$ mail -s "$subject"

mail(1), being by definition an intelligent user agent, should know that the current locale is UTF-8 and that a UTF-8 header must be encoded (here with base64) to comply with the RFCs. It should also know that when the mail body is UTF-8, the header 'Content-Type: text/plain; charset=UTF-8' must not be omitted. I think this is a bug, as both are required by the RFCs. What do you think?
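One possible way to make the URI appear only once is to keep it in a shell variable, fetch the page a single time into another variable, and run both grep passes against that saved dump. The following is only a sketch under assumptions: the URI, the grep patterns, the mail flags and the recipient are copied from the command above; the names url, rcpt, encode_subject and send_report are hypothetical; `base64 | tr -d '\n'` stands in for the GNU-specific `base64 -w0`; and `head -n 1` is an added guard that keeps the subject to a single line. An RFC 2047 encoded-word is limited to 75 characters, so a long subject would need to be split, which this sketch does not attempt.

```shell
#!/bin/sh
# Sketch only: URI, grep patterns, mail flags and address come from the
# original one-liner; all variable and function names are hypothetical.
url='http://search1.taobao.com/browse/33/n-g,w6y4zzjaxxymvjomxy----------------40--commend-0-all-33.htm?at_topsearch=1&ssid=e-s5'
rcpt='zhangweiwu@realss.com'

# Wrap UTF-8 text as a base64 RFC 2047 encoded-word: =?UTF-8?B?...?=
# tr removes the line breaks that some base64 implementations insert.
encode_subject() {
    printf '=?UTF-8?B?%s?=' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}

# Fetch the page once, then reuse the saved dump for subject and body.
send_report() {
    page=$(w3m -dump "$url") || return 1
    # Assumption: only the first matching line is wanted as the subject.
    subject=$(printf '%s\n' "$page" | grep '找到.*件' | head -n 1)
    printf '%s\n' "$page" | grep -A 100 '对比' |
        mail -a 'Content-Type: text/plain; charset=UTF-8' \
             -s "$(encode_subject "$subject")" "$rcpt"
}

# To actually send the mail (for example from cron), uncomment:
# send_report
```

Saved as a script, this leaves a single place to edit when the URI changes, and the page is downloaded only once per run.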